Abstract: This paper focuses on the stationary portion 
of file download in an unstructured peer-to-peer network, 
which typically follows for many hours after a flash crowd 
initiation. The model includes the case that peers can have 
some pieces at the time of arrival. The contribution of the 
paper is to identify how much help is needed from the seeds, 
either fixed seeds or peer seeds (which are peers remaining 
in the system after obtaining a complete collection) to 
stabilize the system. The dominant cause for instability 
is the missing piece syndrome, whereby one piece becomes 
very rare in the network. It is shown that stability can be 
achieved with only a small amount of help from peer seeds: 
even with very little help from a fixed seed, peers need 
dwell as peer seeds on average only long enough to upload 
one additional piece. The region of stability is insensitive to 
the piece selection policy. Network coding can substantially 
increase the region of stability in case a portion of the new 
peers arrive with randomly coded pieces. 

Keywords: Peer to peer, missing piece syndrome, ran- 
dom peer contact, random useful piece selection, Foster- 
Lyapunov stability, Markov process 



I. Introduction 

Second generation P2P networks such as BitTorrent 
[1], divide a file to be distributed into distinct pieces and 
enable peers (or clients) to share these pieces efficiently. 
BitTorrent, with its rarest first and choke algorithms [1, 
2], has been shown in practice to scale well with the 
number of participating peers [2-8]. 

Understanding how a BitTorrent like P2P system 
works over a long period of time is difficult, due to the 
following details. Each peer maintains a set of neighbors 
it can connect with. According to the choking algorithm, 
a peer unchokes three neighbors from which the peer 
has the fastest download rate; at the same time, it also 
unchokes a randomly chosen neighbor which has pieces 
needed by the peer. The choking algorithm works as 
a distributed peer selection mechanism to continuously 
shape the topology of the network; it is influenced by 
heterogeneous link speeds and by the sets of pieces 
available at different peers. Peers track the pieces avail- 
able at their neighbors and the selection of pieces to 
be downloaded is biased towards the rarest pieces first. 
Consequently, analytical models capturing all aspects of 
BitTorrent in detail are intractable. Simulations have re- 
vealed extensive insight about the scalability, robustness, 
and efficiency of P2P networks, but simulations alone 
can cover only a small portion of the range of parame- 
ter values and network settings. Analysis complements 
simulations by helping to identify potential pitfalls and 
as a means to understand and avoid them. 

The following stochastic model of P2P networks is 
examined in [9, 10]. A seed uploads at a constant rate 
$U_s$; peers arrive as a rate $\lambda$ Poisson process; the seed and 
peers apply uniform random peer selection and diverse 
piece selection policies; each peer leaves as soon as it has 
all pieces. It is shown in [9, 10] that the stability region 
is governed by the missing piece syndrome. The missing 
piece syndrome is an abnormal condition appearing 
when there are many peers in the system and all of 
them are missing the same piece. Such a large group of 
peers missing the same piece severely limits the spread 
of the piece in the network. Peers without the missing 
piece quickly join the group and peers with the missing 
piece quickly depart. The main result in [9, 10] is that 
the network may never recover from the missing piece 
syndrome if the upload rate of the seed is less than the 
arrival rate of new peers, and the network is positive 
recurrent if the upload rate of the seed is larger than 
the arrival rate of new peers. 

This paper extends the basic results of [9, 10] in two 
particular ways: peers can already have some pieces at 
the time of their arrival, and peers can dwell awhile 
in the network after obtaining a complete collection. 
The main result in this paper, Theorem 1, provides 
the stability region of the network within the space of 
values of arrival rates, seed uploading capacity, and peer 
dwelling time. The proof of the main result is shaped by 
showing that the system either is trapped by the missing 
piece syndrome, or that it always escapes the missing 
piece syndrome, depending on the parameter values. 
This paper reveals the least amount of time peers must 
dwell after obtaining the entire file so that the whole 
network is positive recurrent. A corollary of our result 
is that if each peer can upload one additional piece after 
obtaining the whole file before departing, the network 
is stable under any positive seed uploading capacity and 
any arrival rates. In BitTorrent, the size of a single piece 
is typically a small fraction of the entire file (about 0.5%) 
so that it is a light burden for a peer to dwell in the 
network long enough to upload one more piece after 
obtaining a complete collection. The proof techniques 
are similar to those used in [9, 10], but are modified 
to handle the more general model here. For the proof 
of the positive recurrence for other parameter values, 
a Lyapunov function is used as in [9, 10], but it is no 
longer quadratic, and a variation of the standard big "O" 
notation is introduced. There are quadratic terms in the 
Lyapunov function, but some related terms are added to 
cover the case that sufficient downloading capacity has 
to build up as new arrivals bring new pieces with them. 

Four extensions to Theorem 1 are also presented 
in this paper. The first extension is to point out that 
Theorem 1 remains true for a wide variety of piece 
selection policies, as long as they select useful pieces 
when present, and the same uniform, random peer selec- 
tion policy is used. The second extension is to point out 
how Theorem 1 can be modified to incorporate network 
coding. Such an extension was also given in [10] for the 
less general model there, which specified, in particular, 
that peers have no pieces when they arrive. In that 
context it was shown in [10] that network coding does 
not increase the region of stability of the peer to peer 
system. In contrast, we find here that when peers arrive 
with some (randomly coded) pieces, network coding 
substantially increases the region of stability. The third 
extension addresses variations of the model such that 
the time between two consecutive transfer attempts is 
reduced if there is no useful piece to transfer. The fourth 
extension is to consider the borderline case, between the 
necessary and sufficient conditions of Theorem 1. 

The organization of the paper is as follows. Related 
work is presented in Section II. The network model and 
Theorem 1, the main result of this paper, are described 
in Section III. Section IV presents three examples that 
illustrate Theorem 1. Section V presents an outline 
of the proof of Theorem 1, while the detailed proof 
itself is given in Sections VI and VII, which prove the 
transience and positive recurrence parts of Theorem 1, 
respectively. The extensions to Theorem 1 are given in 
Section VIII, and a brief conclusion is given in Section 
IX. Miscellaneous results used in the main part of the 
paper are summarized in the appendix. 

II. Related Work 

This section briefly points to work related to stability 
and the missing piece syndrome in BitTorrent like P2P 
networks with models similar to the one here. Like 
this paper, the paper of Massoulie and Vojnovic [11] 
assumes that peers having various collections of pieces 
arrive according to Poisson processes, although there is 
no seed. The analysis given in [11] is based on scaling 
the initial state and the arrival rates by a parameter that 
goes to infinity. The asymptotic analysis gives rise to 
a fluid limit, described by a vector ordinary differential 
equation. The existence of a symmetric equilibrium point 
of the fluid limit is established. Like this paper, the 
paper of Leskela, Robert, and Simatos [12] considers the 
case of each peer dwelling awhile after it has obtained 
a complete collection. The case in which a file is not 
divided at all, and the case in which a file is divided 
into two pieces that must be collected by all peers in the 
same order, are considered, and the required mean dwell 
time is identified for stabilizing the system. Models in 
[9-12] are discussed as special cases of the model in this 
paper. 

Two-piece P2P models under slightly different as- 
sumptions are studied in [13], and essentially the same 
stability condition as in [9, 10] is obtained for the two- 
piece special case. By modeling BitTorrent as multiple 
M/G/$\infty$ queues, the authors in [5] provide closed form 
steady state distributions and study the self-sustainability 
of their systems. In the simulation of [5], the authors 
find their "smooth download assumption" and "swarm 
sustainability" break down if the seed upload capacity is 
small; this is evidence of the missing piece syndrome. 

The BitTorrent choking algorithm has attracted con- 
siderable interest from researchers, due to its ability to 
encourage reciprocity and increase scalability. Based on 
experiments for the case of flash crowds in BitTorrent, 
the authors in [2] concluded that the choke algorithm and 
rarest first piece selection together can foster reciproca- 
tion and guarantee close to ideal diversity of the pieces 
among peers. It is worth noting that the experiment in 
[2] about transient states, which appear because of the 
upload constraint of the seed, gives evidence of the missing 
piece syndrome. In [14], the authors show that the 
choking algorithm can facilitate the formation of clusters 
of similar-bandwidth peers. The authors measured the 
performance of BitTorrent protocols on a PlanetLab 
platform, and discovered that when the seed upload 
capacity is high, peers mainly upload to other peers 
with roughly the same bandwidth. But when the seed 
upload capacity is low, such clustering of peers does not 
emerge. In [15], the authors compare direct reciprocity, 
where users exchange contents directly, and indirect 
reciprocity, where users upload contents based on credits 
of their targets. They show that an indirect reciprocity 
schedule can be replaced by a direct reciprocity schedule 
with a loss of efficiency of at most one half if users can 
restore undemanded contents for bartering. They also 
provide simulations showing the benefits of having a 
public board which announces the content distribution 
and having a matchmaker which pairs users together by 
a maximum weight matching algorithm. 

Papers [16-18] concern concurrent delivery of multi- 
ple files in a P2P network. Peers can store files they 
do not request in order to increase reciprocation and 
efficiency of file distribution. Models about single-piece 
file sharing through mobile networks are studied in [17, 
18]. In [17] the authors suppose multiple single-piece 
files are to be downloaded by some of the peers, and 
peers store and exchange files they do not request. 
Assuming Poisson arrivals and random peer contact, the 
authors establish fluid limits for a broad family of file 
exchanging policies, and derive the stability region for a 
static-case policy (peers do not exchange files unless they 
can get their requested files). They further show that by 
mixing multiple swarms together the network scalability 
is increased in the sense that only one swarm can become 
unstable. In [16] the authors discuss multiple-channel 
live streaming and show how the performance increases 
if some peers can apply their spare capacity to distribute 
channels they are not watching. Papers [19,20] are also 
about live streaming by P2P networks. In [20] the authors 
provide a simple queue model to compare rarest first and 
greedy piece selection policies in P2P live streaming, and 
propose a mixed selection policy to balance the trade-off 
between start-up latency and continuity. 

Network coding can improve the network perfor- 
mance. Network coding was first proposed in [21], where 
it is shown that a sender can communicate information 
to a set of receivers if the min-cut max-flow bound is sat- 
isfied for connections to each receiver. Simulations with 
network coding applied in P2P file distribution described 
in [7] show that in a P2P network under topologies with 
bad cuts, network coding can provide a much higher 
average file distribution rate than that provided without 
coding or with source coding only. Better robustness also 
appears when network coding is simulated on a P2P 
network with dynamic arrivals and departures. In [22] 
the authors study a gossip model under random linear 
coding, in which each peer initially has a single unique 
piece and all peers are to collect all pieces. Peers are 
assumed to apply random contact and to transmit random 
linear combinations of the messages they own to their 
targets. It is shown in [22] that with network coding, 
the gossip can be completed in time proportional to the 
number of peers, with high probability. The paper [19] 
focuses on the efficiency of network coding for P2P 
live streaming. It shows that when network coding is 
applied and a distributed, stochastic version of a primal-dual 
algorithm is used, then a fluid scale limit admits a 
cost optimal operating point as a fixed point. Network 
coding is considered in [10] for the assumptions of that 
paper (peers arrive with no pieces, there is a fixed seed, 
and peers depart after obtaining a complete collection). 
In that context, while network coding eliminates the need 
for peers to exchange lists of pieces, the condition for 
stability is nearly the same as for random useful piece 
selection without network coding. 

III. Model and Result 

The model discussed in this paper is a combination of 
related models in [3,4, 11]. In a single fixed seed P2P 
network, a large file is divided into $K$ pieces, for some 
$K \ge 1$, which are stored in the fixed seed. The fixed seed 
is not considered to be a peer. Each peer in the system 
holds some subset of the pieces. For any subset $C$ of 
the total collection of pieces $\{1, 2, \ldots, K\}$, a peer holding 
the collection of pieces $C$ is called a type $C$ peer. In 
some real P2P networks, peers can get some pieces from 
a tracker upon their arrival for initialization. To capture 
that case, we assume type $C$ peers arrive into the system 
at times of a Poisson process with rate $\lambda_C$. Although we 
consider all possible values of $(\lambda_C, C \in \mathcal{C})$, typically in 
practice, $\lambda_C$ is small or equal to zero when $|C| > 1$. 

The fixed seed and all peers use the random peer 
contact and random useful piece selection strategies at 
instants of Poisson processes, with the contact-upload 
rate of the fixed seed denoted by $U_s$ and the contact-upload 
rate of any peer denoted by $\mu$, $\mu > 0$. Specifically, 
suppose the fixed seed and each peer maintain internal 
Poisson clocks; the clock of the fixed seed ticks at rate 
$U_s$, and the clock of any peer ticks at rate $\mu$. Whenever 
the clock of the fixed seed ticks, the fixed seed contacts 
a peer, say peer A, which is selected uniformly from 
among all peers. According to the random useful piece 
selection strategy, the fixed seed checks to see if A needs 
any pieces, and uploads to A the copy of one piece 
uniformly chosen from among the pieces needed by A. 
If A does not need any pieces (because A is a peer seed), 
no piece is uploaded and the fixed seed remains silent 
between clock ticks. 

A peer similarly uploads pieces. When its rate-$\mu$ 
Poisson clock ticks, it contacts a peer selected at random, 
and checks to see whether it has pieces needed by the 
contacted peer. If the answer is yes, it uploads to the 
contacted peer a copy of a piece uniformly chosen from 
among its pieces needed by the contacted peer; if the 
answer is no, no piece is uploaded and the peer does not 
upload pieces between clock ticks. The peer contacts and 
piece uploads of the fixed seed and peers are assumed 
to be instantaneous. 

In a real P2P network, peers may upload two or more 
pieces to different peers at the same time, and peer 
selection, peer contact and piece upload are not instan- 
taneous. For mathematical simplification, we consider 
a homogeneous network with the maximum number of 
upload links of each peer limited to one, and apply the 
waiting times of Poisson clocks to model the total time 
consumed for peer selection, contact, and piece upload. 
So $1/\mu$ and $1/U_s$ are approximately the average piece 
transmission time from peer to peer and from the fixed 
seed to peer in a real P2P network. 

Assume that each peer, after becoming a peer seed, 
dwells in the system for an exponentially distributed 
length of time with mean $1/\gamma$, with $0 < \gamma \le \infty$. The 
case $\gamma = \infty$ is shorthand notation for the case that peers 
depart immediately after collecting all pieces. Intuitively, 
smaller values of $\gamma$ yield better system performance, 
because peer seeds can upload more pieces if they stay 
in the system longer. Our result identifies the smallest 
mean peer seed dwelling time (i.e. largest $\gamma$) sufficient 
for a stable system. If the rate $U_s$ of the fixed seed is 
sufficiently large, or if the rates $\lambda_C$ are large enough 
for some nonempty $C$, the system can be stable even if 
peers do not become peer seeds (i.e. even if $\gamma = \infty$). The 
arrivals of new peers, the peer seed dwell times, and the 
ticking of Poisson clocks, are mutually independent. The 
notation and assumptions of the model are summarized 
as follows: 

• $\mathcal{C}$: Set of all subsets of $\mathcal{F} = \{1, \ldots, K\}$, where 
$K \ge 1$ is the number of pieces, and $\mathcal{F}$ is the 
collection of all pieces. 

• Type $C$ peer: A peer with set of pieces $C \in \mathcal{C}$ is a 
type $C$ peer, which becomes a type $C \cup \{i\}$ peer if 
the seed or another peer uploads piece $i \notin C$ to it. 
A type $\mathcal{F}$ peer is also called a peer seed. 

• Type $C$ group: The set of type $C$ peers in the 
system. 

• Arrivals: Exogenous arrivals of type $C$ peers form 
a rate $\lambda_C \in [0, \infty)$ Poisson process. To avoid 
triviality, assume the total arrival rate of peers, 
$\lambda_{total} = \sum_{C \in \mathcal{C}} \lambda_C$, is strictly positive. Also, 
without loss of generality, if $\gamma = \infty$, assume 
$\lambda_{\mathcal{F}} = 0$. 

• Random peer contact: The fixed seed contacts a 
uniformly chosen peer at instants of a Poisson 
process with rate $U_s \in [0, \infty)$. Every peer contacts 
a uniformly chosen peer at instants of a Poisson 
process with rate $\mu \in (0, \infty)$. 

• Random useful piece upload: When A contacts B, 
if B does not have all pieces that A has, A uploads 
to B a copy of one piece uniformly chosen from 
among the pieces A has but B does not have. 
Otherwise no piece is uploaded. 

• Departures: If $\gamma \in (0, \infty)$, every peer becomes a 
peer seed after obtaining all $K$ pieces, and subsequently 
remains in the system for an exponentially 
distributed length of time with mean $1/\gamma$ before 
departing. If $\gamma = \infty$, then $\lambda_{\mathcal{F}} = 0$ and peers depart 
immediately after obtaining all $K$ pieces. 
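
The random useful piece upload rule in the list above can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function name `useful_piece_upload` and the representation of a peer as a Python set of piece indices are assumptions made here for the example.

```python
import random

def useful_piece_upload(uploader, receiver, rng=random):
    """Return the piece uploaded when `uploader` contacts `receiver`, or None.

    Both arguments are sets of piece indices (hypothetical representation).
    The piece is drawn uniformly from the pieces the uploader has and the
    receiver lacks; if there is no such piece, nothing is uploaded.
    """
    useful = sorted(uploader - receiver)
    if not useful:
        return None  # no useful piece: the tick is wasted
    return rng.choice(useful)

K = 4
fixed_seed = set(range(1, K + 1))   # the fixed seed holds all K pieces
peer_a = {1, 2}
peer_b = {1, 2, 4}

print(useful_piece_upload(peer_a, peer_b))      # None: A has nothing B lacks
print(useful_piece_upload(peer_b, peer_a))      # 4: the only useful piece
print(useful_piece_upload(fixed_seed, peer_b))  # 3: the only piece B lacks
```

When several pieces are useful, the choice is uniform among them, which is the only randomness in the upload step itself.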

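Under the assumptions above, the system is a Markov 
chain with state vector $x = (x_C : C \in \mathcal{C}) \in \mathbb{Z}_+^{|\mathcal{C}|}$ if 
$\gamma \in (0, \infty)$, and $x = (x_C : C \in \mathcal{C} - \{\mathcal{F}\}) \in \mathbb{Z}_+^{|\mathcal{C}|-1}$ if 
$\gamma = \infty$, where $x_C$ is defined to be the number of type 
$C$ peers, except we define $x_C = 0$ in the case $C = \mathcal{F}$ 
and $\gamma = \infty$. Define $\Gamma_{C,C'}$ for $C, C' \in \mathcal{C}$ as follows: 

$$\Gamma_{C,C'} = \frac{x_C}{n} \left( \frac{U_s}{K - |C|} + \mu \sum_{S : i \in S} \frac{x_S}{|S - C|} \right)$$

if $n \ge 1$ and $C' = C \cup \{i\}$ for some $i \in \mathcal{F} - C$, and 
$\Gamma_{C,C'} = 0$ else, where $n := \sum_{C \in \mathcal{C}} x_C$ is the total 
number of peers. In words, unless $C' = \mathcal{F}$ and $\gamma = \infty$, 
$\Gamma_{C,C'}$ is the aggregate rate of transition of peers from 
type $C$ to type $C'$; if $C' = \mathcal{F}$ and $\gamma = \infty$, $\Gamma_{C,C'}$ is the 
aggregate rate of departures from the system of peers of 
type $C$. 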
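
The aggregate transition rate can be computed directly from the contact and piece-selection rules: the fixed seed ticks at rate $U_s$, hits a type $C$ peer with probability $x_C/n$, and picks piece $i$ among the $K - |C|$ missing ones; a type $S$ peer holding piece $i$ ticks at rate $\mu$, hits a type $C$ peer with probability $x_C/n$, and picks piece $i$ among its $|S - C|$ useful ones. The sketch below follows that reading; the function name `transition_rate` and the dictionary-based state are assumptions for illustration.

```python
def transition_rate(x, C, i, U_s, mu, K):
    """Aggregate rate at which type-C peers receive piece i.

    x maps each type (a frozenset of piece indices) to the number of peers
    of that type; this state representation is a choice made for the sketch.
    """
    C = frozenset(C)
    n = sum(x.values())
    if n == 0 or i in C or x.get(C, 0) == 0:
        return 0.0
    # Fixed seed: piece i is one of K - |C| equally likely useful pieces.
    seed_part = U_s / (K - len(C))
    # Peers of type S with i in S: piece i is one of |S - C| useful pieces.
    peer_part = mu * sum(cnt / len(S - C) for S, cnt in x.items() if i in S)
    return x[C] / n * (seed_part + peer_part)

# K = 2: three peers hold only piece 2, one peer seed holds both pieces.
state = {frozenset({2}): 3, frozenset({1, 2}): 1}
print(transition_rate(state, {2}, 1, U_s=1.0, mu=1.0, K=2))  # 1.5
```

In the usage line, the rate is $(3/4)(1 + 1) = 1.5$: the seed and the single peer seed each contribute rate one, and three quarters of contacts land on a type {2} peer.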

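Let $e_C$ denote the vector with the same dimension 
as $x$, with a one in position $C$ and other coordinates 
equal to zero. The positive entries of the generator matrix 
$Q = (q(x, x'))$ are given by: 

• if $\gamma \in (0, \infty)$, $x = (x_C : C \in \mathcal{C})$, 

$$\begin{aligned}
q(x, x + e_C) &= \lambda_C \\
q(x, x - e_{\mathcal{F}}) &= \gamma x_{\mathcal{F}} \\
q(x, x - e_C + e_{C \cup \{i\}}) &= \Gamma_{C, C \cup \{i\}}, \quad \text{if } i \notin C.
\end{aligned}$$

• if $\gamma = \infty$, $x = (x_C : C \in \mathcal{C} - \{\mathcal{F}\})$, 

$$\begin{aligned}
q(x, x + e_C) &= \lambda_C \\
q(x, x - e_C + e_{C \cup \{i\}}) &= \Gamma_{C, C \cup \{i\}}, \quad \text{if } |C| \le K - 2,\ i \notin C, \\
q(x, x - e_C) &= \Gamma_{C, \mathcal{F}}, \quad \text{if } |C| = K - 1.
\end{aligned}$$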
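
For readers who want to experiment with the model, the following is a rough peer-level simulation sketch of the dynamics described by the generator. It is an illustration under stated assumptions, not the authors' code: the function name `simulate`, the list-of-frozensets state, and the choice to record the peer count after every clock event are all made up for this example. Clock ticks that find no useful piece are simulated as events that leave the state unchanged, and when $\gamma = \infty$ the arrival rate of complete peers is assumed to be zero, as in the model.

```python
import math
import random

def simulate(K, U_s, mu, gamma, lam, t_end, seed=0):
    """Return a list of (time, number of peers) pairs up to time t_end.

    lam maps arrival types (frozensets of piece indices in 1..K) to Poisson
    rates with positive total; gamma may be math.inf (immediate departure).
    """
    rng = random.Random(seed)
    full = frozenset(range(1, K + 1))
    types = [C for C, r in lam.items() if r > 0]
    weights = [lam[C] for C in types]
    lam_total = sum(weights)
    peers = []                       # one frozenset of held pieces per peer
    t, history = 0.0, [(0.0, 0)]

    def upload(src, j):
        """Upload one uniformly chosen useful piece from set src to peer j."""
        useful = sorted(src - peers[j])
        if useful:
            peers[j] = peers[j] | {rng.choice(useful)}
            if peers[j] == full and math.isinf(gamma):
                peers.pop(j)         # no dwell time when gamma is infinite

    while True:
        n = len(peers)
        seeds = [j for j, p in enumerate(peers) if p == full]
        dep_rate = 0.0 if math.isinf(gamma) else gamma * len(seeds)
        rates = [lam_total, U_s if n else 0.0, mu * n, dep_rate]
        t += rng.expovariate(sum(rates))
        if t > t_end:
            return history
        event = rng.choices(range(4), weights=rates)[0]
        if event == 0:               # exogenous arrival
            peers.append(rng.choices(types, weights=weights)[0])
        elif event == 1:             # fixed seed clock tick
            upload(full, rng.randrange(n))
        elif event == 2:             # clock tick of a uniformly chosen peer
            src = peers[rng.randrange(n)]
            upload(src, rng.randrange(n))
        else:                        # a peer seed departs
            peers.pop(rng.choice(seeds))
        history.append((t, len(peers)))

# Single-piece file, as in Example 1: empty-handed arrivals at rate 0.5.
path = simulate(K=1, U_s=1.0, mu=1.0, gamma=2.0,
                lam={frozenset(): 0.5}, t_end=50.0, seed=1)
print(len(path), path[-1])
```

Plotting the second coordinate of `path` against the first for parameter values on either side of the stability boundary gives a quick visual check of whether the peer count drifts upward.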

The following theorem, which is the main result of this 
paper, describes the stability region of the P2P system. 

Theorem 1. Let $U_s \in [0, \infty)$, $\mu \in (0, \infty)$, $\gamma \in (0, \infty]$, 
$\{\lambda_C : C \in \mathcal{C}, \lambda_C \in [0, \infty)\}$ with $\lambda_{\mathcal{F}} = 0$ if $\gamma = \infty$, and 
$\lambda_{total} > 0$ be given. 

(a) The Markov process with generator matrix $Q$ is 
transient if either of the following two conditions is true: 

• $0 < \mu < \gamma \le \infty$ and for some $k \in \mathcal{F}$, 

$$\lambda_{total} > \frac{U_s + \sum_{C : k \in C} \lambda_C (K + 1 - |C|)}{1 - \mu/\gamma}. \tag{2}$$

• $0 < \gamma \le \mu$ and for some piece $k \in \mathcal{F}$, no copies of 
piece $k$ can enter the system. 

(b) Conversely, the process is positive recurrent and 
$E[N] < \infty$, where $N$ denotes a random variable with 
the stationary distribution of number of peers in the 
system, if either of the following two conditions is true: 

• $0 < \mu < \gamma \le \infty$ and for any $k \in \mathcal{F}$, 

$$\lambda_{total} < \frac{U_s + \sum_{C : k \in C} \lambda_C (K + 1 - |C|)}{1 - \mu/\gamma}. \tag{3}$$

• $0 < \gamma \le \mu$ and for any $k \in \mathcal{F}$, it is possible for 
new copies of piece $k$ to enter the system. 

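We remark that when we say new copies of piece $k$ 
can enter the system, we mean $U_s > 0$ or $\lambda_C > 0$ for 
some $C \in \mathcal{C}$ such that $k \in C$. And we remark that 
condition (3) holding for all $k \in \mathcal{F}$ is equivalent to the 
following: for any $S \in \mathcal{C} - \{\mathcal{F}\}$, 

$$\sum_{C : C \subseteq S} \lambda_C - \frac{U_s + \sum_{C : C \not\subseteq S} \lambda_C (K - |C| + \mu/\gamma)}{1 - \mu/\gamma} < 0. \tag{4}$$

In particular, (4) holds for all $S \in \mathcal{C} - \{\mathcal{F}\}$ if it holds 
for all $S \in \{\mathcal{F} - \{k\} : k \in \mathcal{F}\}$. 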
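
The conditions of Theorem 1 are simple enough to check numerically for a given parameter set. The sketch below is one way to do so under assumptions made here: the function name `classify`, the dictionary of arrival rates keyed by frozensets, and the label "borderline" for parameter values covered by neither part of the theorem are all illustrative choices, and the boundary between the two regimes is taken to be $\gamma \le \mu$.

```python
import math

def classify(K, U_s, mu, gamma, lam):
    """Classify the model by the conditions of Theorem 1.

    lam maps each arrival type (a frozenset of piece indices in 1..K) to its
    Poisson rate; gamma may be math.inf. Returns "transient",
    "positive recurrent", or "borderline" (neither part applies).
    """
    lam_total = sum(lam.values())
    if lam_total <= 0:
        raise ValueError("the total arrival rate must be strictly positive")

    def inflow(k):
        # U_s + sum over types C containing k of lam_C * (K + 1 - |C|)
        return U_s + sum(r * (K + 1 - len(C)) for C, r in lam.items() if k in C)

    pieces = range(1, K + 1)
    if gamma <= mu:
        # Piece k can enter the system exactly when inflow(k) > 0.
        ok = all(inflow(k) > 0 for k in pieces)
        return "positive recurrent" if ok else "transient"

    scale = 1.0 - (0.0 if math.isinf(gamma) else mu / gamma)
    thresholds = [inflow(k) / scale for k in pieces]
    if any(lam_total > t for t in thresholds):
        return "transient"
    if all(lam_total < t for t in thresholds):
        return "positive recurrent"
    return "borderline"

# Four pieces, no fixed seed, immediate departures, two arrival types.
rates = {frozenset({1, 2}): 1.0, frozenset({3, 4}): 1.0}
print(classify(4, 0.0, 1.0, math.inf, rates))  # positive recurrent
```

In the usage line every threshold equals 3 while the total arrival rate is 2, so condition (3) holds for every piece.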



IV. Three Examples 

To illustrate Theorem 1, we examine the three examples 
of P2P networks shown in Figure 1. 

Fig. 1. Examples. (Figure omitted; the surviving labels are "Fixed seed" and the panel title "(c) K = 3".) 

Example 1: This example is treated in [12]. As shown 
in Figure 1(a), the file is transferred as a single piece, 
that is, $K = 1$. New peers without any piece arrive into 
the system at the times of a Poisson process with rate $\lambda_0$. 
After obtaining the piece a peer becomes a peer seed. At 
rate $U_s$, the fixed seed contacts and uploads the piece to 
new peers, which become peer seeds after obtaining the 
piece. When peer seeds are in the system, they randomly 
contact and upload copies of the piece to new peers with 
rate $\mu$, which creates more peer seeds. After staying for 
an exponentially distributed time period with mean $1/\gamma$, 
a peer seed leaves the system. This example illustrates 
our model with parameters $K = 1$, $U_s$, $\mu$, $\gamma$, $\lambda_\emptyset = \lambda_0 \in 
(0, \infty)$, and $\lambda_{\{1\}} = 0$. 

The stability of a system is determined by its ability 
to recover from a heavy load. First consider the case that 
there are many peer seeds in the system. Because every 
peer seed departs at rate $\gamma$, in essence, the service rate 
$\gamma x_{\mathcal{F}}$ scales linearly with the number of peer seeds, $x_{\mathcal{F}}$, 
as in an infinite server system, so the system can recover 
no matter how many peer seeds there are. Secondly, 
consider the case that there are many type $\emptyset$ peers and 
few peer seeds. For a long time period, when the fixed 
seed or a peer seed randomly contacts a peer to upload a 
piece, the probability they contact a type $\emptyset$ peer is close 
to one. So the group of type $\emptyset$ peers receives uploads 
from the fixed seed at rate almost $U_s$. Once a peer 
becomes a peer seed, it can upload more pieces to type 
$\emptyset$ peers, creating more peer seeds, which upload more 
pieces. So every peer seed can create a branching process 
of departures from the type $\emptyset$ group. The mean amount 
of time a peer seed stays in the system is $1/\gamma$, and during 
its stay it uploads pieces to type $\emptyset$ peers at rate close to 
$\mu$. So on average, a peer seed can upload $\mu/\gamma$ pieces to type $\emptyset$ 
peers. By the theory of branching processes, if $\mu/\gamma > 1$, 
the expected number of descendants of a peer seed is 
infinite, which stabilizes the process. If $\mu/\gamma < 1$, on 
average every peer seed has $\frac{\mu/\gamma}{1 - \mu/\gamma}$ descendants. Hence, 
every upload of the piece by the fixed seed to a type $\emptyset$ 
peer causes, on average, about $\frac{1}{1 - \mu/\gamma}$ departures from 
the type $\emptyset$ group. Comparing to $\lambda_0$, the arrival rate of 
type $\emptyset$ peers, this suggests that the system is stable if 
either $\mu > \gamma$, or $\mu < \gamma$ and $\lambda_0 < \frac{U_s}{1 - \mu/\gamma}$. Conversely, 
if $\mu < \gamma$ and $\lambda_0 > \frac{U_s}{1 - \mu/\gamma}$, the arrival rate of type $\emptyset$ 
peers is larger than the average rate of departures 
from the type $\emptyset$ group, indicating that the system cannot 
always recover from the heavy load of type $\emptyset$ group and 
so it is unstable. This conclusion is confirmed by [12] 
and Theorem 1. 
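
The branching-process counts used in this heuristic can be written out explicitly. Under the heavy-load approximation that each peer seed produces on average $\mu/\gamma$ new peer seeds, generation $j$ of the process has expected size $(\mu/\gamma)^j$, so for $\mu/\gamma < 1$ a geometric series gives the totals quoted in the text:

```latex
\begin{align*}
\mathbb{E}[\text{peer seeds created, root included}]
  &= \sum_{j=0}^{\infty} \left(\frac{\mu}{\gamma}\right)^{j}
   = \frac{1}{1-\mu/\gamma}, \\
\mathbb{E}[\text{descendants, root excluded}]
  &= \frac{1}{1-\mu/\gamma} - 1
   = \frac{\mu/\gamma}{1-\mu/\gamma}.
\end{align*}
```

Multiplying the first quantity by the seed upload rate $U_s$ gives the approximate aggregate departure rate $U_s/(1-\mu/\gamma)$ that is compared with the arrival rate $\lambda_0$.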

Example 2: As shown in Figure 1(b), the file is 
divided into four pieces, that is, $K = 4$. There are 
two types of new peers, type {1,2} and type {3,4}, 
which arrive as two independent Poisson processes 
with respective rates $\lambda_{12}$ and $\lambda_{34}$. There is no fixed 
seed in the system. Peers contact and upload pieces 
to each other so that they can depart. Peers depart 
immediately after obtaining all four pieces; there are 
no peer seeds in the system. This example illustrates 
our model with parameters $K = 4$, $U_s = 0$, $\gamma = \infty$, 
$\mu$, $\lambda_{\{1,2\}} = \lambda_{12}$, $\lambda_{\{3,4\}} = \lambda_{34} \in (0, \infty)$, and $\lambda_C = 0$ for 
$C \ne \{1,2\}, \{3,4\}$. 

Consider the ability of the system to recover from a 
heavy load. First, consider the network starting from a 
state such that all peers are type {1, 2, 4} and there are 
so many type {1,2,4} peers that the fraction of them 
among all peers is close to one for a long time. On one 
hand, most new type {1,2} peers download piece 4 from 
a type {1,2,4} peer and join the type {1,2,4} group, 
so the arrival rate of type {1, 2, 4} peers is close to $\lambda_{12}$. 
On the other hand, most new type {3, 4} peers download 
pieces 1 and 2 from type {1, 2, 4} peers and then depart, 
with an expected lifetime in the system approximately 
$2/\mu$. During its lifetime, a type {3,4} peer uploads piece 
3 to two type {1,2,4} peers on average and thereby 
induces two departures on average. So the medium term 
aggregate departure rate of type {1,2,4} peers is close to 
$2\lambda_{34}$. Hence, if $\lambda_{12} < 2\lambda_{34}$, the system is able to recover 
from a heavy load of type {1, 2, 4} (or {1, 2, 3}) peers. 
Conversely, if the inequality goes the other way, that is, 
$\lambda_{12} > 2\lambda_{34}$, the arrival rate of type {1,2,4} peers is 
larger than the aggregate departure rate of type {1, 2, 4} 
peers. So the type {1,2,4} group will keep growing. 
Thus if $\lambda_{12} > 2\lambda_{34}$ the system cannot always recover 
from a heavy load of type {1, 2, 4} (or {1, 2, 3}) peers. 
Similarly, if $\lambda_{34} < 2\lambda_{12}$ the system can recover from a 
heavy load of type {2, 3, 4} (or {1, 3, 4}) peers. And the 
system cannot always recover from the same heavy load 
if $\lambda_{34} > 2\lambda_{12}$. 

The situation is similar if there is a heavy load of 
type {1, 2} (or {3,4}) peers, while the other groups are 
empty. The arrival rate of type {1,2} peers is $\lambda_{12}$. The 
aggregate departure rate of type {1,2} peers, from the 
uploads of both type {3, 4} peers and type {1, 2, x}, x = 
3, 4, peers (which are former type {1,2} peers), is larger 
than $2\lambda_{34}$. So if $\lambda_{12} < 2\lambda_{34}$ the system is able to recover 
from the heavy load of type {1,2} peers. 

Secondly, consider the case that there are heavy loads 
in groups of at least two types, e.g. type {1,2} and 
{1,2,3}. There is at least one type of peer that can 
upload to the other type of peer, e.g. type {1, 2, 3} peers 
can upload to type {1,2} peers. There are many uploads 
from type {1, 2, 3} peers to type {1, 2} peers so that the 
departure rate from the type {1,2} group is large, which 
stabilizes the system. This suggests that the system is 
stable if $\lambda_{12} < 2\lambda_{34}$ and $\lambda_{34} < 2\lambda_{12}$, and unstable if 
either $\lambda_{12} > 2\lambda_{34}$ or $\lambda_{34} > 2\lambda_{12}$. This conclusion is 
confirmed by Theorem 1. 
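
The agreement with Theorem 1 can be checked by direct substitution. With $K = 4$, $U_s = 0$ and $\gamma = \infty$ (so that $\mu/\gamma = 0$), the only arrival type containing piece $k \in \{1, 2\}$ is {1,2}, with $K + 1 - |C| = 3$, and likewise for $k \in \{3, 4\}$. The positive recurrence condition of the theorem then reads:

```latex
\begin{align*}
k \in \{1,2\}: \quad
  \lambda_{12} + \lambda_{34} < 3\lambda_{12}
  &\iff \lambda_{34} < 2\lambda_{12}, \\
k \in \{3,4\}: \quad
  \lambda_{12} + \lambda_{34} < 3\lambda_{34}
  &\iff \lambda_{12} < 2\lambda_{34}.
\end{align*}
```

Reversing either strict inequality gives the corresponding transience condition, matching the heuristic conclusion of the example.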

Example 3: As shown in Figure 1(c), the file is 
divided into three pieces, that is, $K = 3$. New peers 
arrive at a total rate $\lambda_{total}$, and each peer arrives with 
one piece, having piece $i$ with probability $\lambda_i/\lambda_{total}$. So 
there are three types of new peers, type {1}, type {2}, 
and type {3}, which arrive as three independent Poisson 
processes with rates $\lambda_1$, $\lambda_2$ and $\lambda_3$, respectively. There 
is no fixed seed in the system. At rate $\mu$ each, peers 
randomly contact and upload pieces to each other. After 
collecting all three pieces, every peer stays in the system 
as a peer seed for an exponentially distributed time with 
mean $1/\gamma$, $\gamma > 0$. This example illustrates our model 
with parameters $K = 3$, $U_s = 0$, $0 < \mu < \gamma < \infty$, 
$\lambda_{\{1\}} = \lambda_1$, $\lambda_{\{2\}} = \lambda_2$, $\lambda_{\{3\}} = \lambda_3 \in (0, \infty)$, $\lambda_C = 0$ for 
$|C| \ne 1$. 
Consider whether the system can recover from a heavy 
load. First, consider the network starting from a state 
such that all peers are type {1,2} and there are so many 
type {1,2} peers that the fraction of them among all 
peers is close to one for a long time. By the reasoning 
of example two, almost every new type {1} and type 
{2} peer joins the type {1,2} group, so the arrival rate 
of the type {1,2} group is close to $\lambda_1 + \lambda_2$. Over the 
medium term, every new type {3} peer has an expected 
lifetime approximately $\frac{2}{\mu} + \frac{1}{\gamma}$, with $\frac{2}{\mu}$ being the expected 
time for the type {3} peer to download two pieces from 
type {1,2} peers, and with $\frac{1}{\gamma}$ being the expected time 
for the type {3} peer to be a peer seed. During its 
lifetime every type {3} peer uploads approximately $2 + \frac{\mu}{\gamma}$ 
pieces to type {1,2} peers on average. By the reasoning 
of example one, every peer seed creates a branching 
process of departures of type {1,2} peers, with the total 
number of new peer seeds (including the root) equal 
to $\frac{1}{1 - \mu/\gamma}$. Thus, on average, every new type {3} peer 
induces $\left(2 + \frac{\mu}{\gamma}\right) \frac{1}{1 - \mu/\gamma}$ departures from the type {1, 2} group, 
so the medium term aggregate departure rate of type 
{1, 2} peers is approximately $\lambda_3 \frac{2 + \mu/\gamma}{1 - \mu/\gamma}$. Hence if 
$\lambda_1 + \lambda_2 < \lambda_3 \frac{2 + \mu/\gamma}{1 - \mu/\gamma}$, the system is able to recover 
from a heavy load of type {1,2} group. Conversely, if 
$\lambda_1 + \lambda_2 > \lambda_3 \frac{2 + \mu/\gamma}{1 - \mu/\gamma}$, the type {1, 2} group will keep 
increasing and the system cannot always recover from 
the heavy load. Similarly, if $\lambda_2 + \lambda_3 < \lambda_1 \frac{2 + \mu/\gamma}{1 - \mu/\gamma}$ 
or $\lambda_1 + \lambda_3 < \lambda_2 \frac{2 + \mu/\gamma}{1 - \mu/\gamma}$, the system is able to 
recover from a heavy load of type {2,3}, or {1,3} 
group. And if either of the two inequalities is reversed, 
the system cannot always recover from a corresponding 
heavy load. 
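
The threshold obtained from this heuristic coincides with the positive recurrence condition of Theorem 1. As a worked check for piece $k = 3$, with $K = 3$, $U_s = 0$, and type {3} the only arrival type containing piece 3 (so $K + 1 - |C| = 3$):

```latex
\begin{align*}
\lambda_1 + \lambda_2 + \lambda_3 < \frac{3\lambda_3}{1-\mu/\gamma}
  &\iff \lambda_1 + \lambda_2
        < \lambda_3 \, \frac{3 - (1-\mu/\gamma)}{1-\mu/\gamma} \\
  &\iff \lambda_1 + \lambda_2
        < \lambda_3 \, \frac{2+\mu/\gamma}{1-\mu/\gamma}.
\end{align*}
```

The cases $k = 1$ and $k = 2$ follow by permuting the indices.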




Fig. 2. Flow of peers (solid lines) and pieces (dashed lines) in the 
system. (Figure omitted; the surviving node labels are normal young peers, gifted peers, and the one club, whose members have all pieces except piece one.) 



Secondly, through considerations similar to those in 
examples one and two, we can see that the system can also 
recover from a heavy load in any other single-type group, or 
from heavy loads in multiple-type groups, if the three 
inequalities above hold. This suggests that the system is 
stable if 

$$\begin{cases}
\lambda_1 + \lambda_2 < \lambda_3 \dfrac{2 + \mu/\gamma}{1 - \mu/\gamma} \\[2mm]
\lambda_2 + \lambda_3 < \lambda_1 \dfrac{2 + \mu/\gamma}{1 - \mu/\gamma} \\[2mm]
\lambda_1 + \lambda_3 < \lambda_2 \dfrac{2 + \mu/\gamma}{1 - \mu/\gamma}
\end{cases}$$

If any one of the three inequalities is reversed, it indicates 
the system is unstable. This is consistent with Theorem 
1. Note that if peers depart immediately after obtaining 
a complete collection (i.e. $\gamma = \infty$), then the stability 
condition becomes 

$$\begin{cases}
\lambda_1 + \lambda_2 < 2\lambda_3 \\
\lambda_2 + \lambda_3 < 2\lambda_1 \\
\lambda_1 + \lambda_3 < 2\lambda_2
\end{cases}$$

If $\lambda_1, \lambda_2, \lambda_3$ are not all equal, at least one inequality is 
reversed, so the system is unstable. This special case 
when $\gamma = \infty$ is considered in [11], and is discussed in 
Section VIII-D below. 


V. Outline of the Proof 

The analysis of the above three examples suggests that 
when we consider the system to be in heavy load, the 
worst distribution of load is that nearly all peers have 
the same type C with |C| = K — 1. If the system is able 
to recover from that kind of heavy load, it can recover 
from other kinds of heavy load. With this intuition in 
mind, a sketch of the proof of Theorem 1 is offered as 
follows. 

First, we sketch the proof of Theorem 1(a) about 
transience when $0 < \mu < \gamma \le \infty$. Without loss 
of generality, assume that (2) is true for $k = 1$, or 
equivalently, $\Delta_{\mathcal{F} - \{1\}} > 0$, where $\Delta_S$ denotes the left-hand side of (4). 

Consider the following partition of peers into five 
groups, as shown in Figure 2. 

• Normal young peer. A normal young peer is a peer
that is missing at least two pieces, one of them being
piece one.

• Infected peer. An infected peer is a peer that obtained
piece one after arriving, but before obtaining
all the other pieces. Once a peer is infected, it
remains infected until it leaves the system; it is
considered to be infected even when it is a peer
seed.

• Gifted peer. A gifted peer is a peer that arrived with
piece one. A gifted peer is gifted for its entire time
in the system; it is considered to be gifted even
when it is a peer seed.

• One-club peer. A one-club peer is a peer that has
all pieces except piece one. That is, the one-club is
the group of peers of type $\{2, 3, \ldots, K\}$.

• Former one-club peer. A former one-club peer is a 
peer in the system that is not a one-club peer but at 
some earlier time was a one-club peer. Note that a 
former one-club peer is a peer seed. The converse 
is not true, because infected peers and gifted peers 
can be peer seeds. 

Consider the system starting from an initial state in 
which there are many peers in the system, and all of 
them are one-club peers. The system evolves as shown 
in Figure 2. Piece one can arrive into the system from 
outside the system in two ways: uploads by the fixed seed 
or arrivals of gifted peers. Ignore for a second the effect 
of normal young peers getting piece one (and becoming 
infected). Most of the uploads by the fixed seed are 
uploads of piece one to one-club peers. One such upload 
creates a new peer seed, which on average will upload 
piece one to about $\mu/\gamma$ more one-club peers, and each of
those will upload piece one to about $\mu/\gamma$ more one-club
peers, and so forth, in a branching process. Each upload
of a piece by the fixed seed thus ultimately causes,
on average, about $\frac{1}{1 - \mu/\gamma}$ departures from the one-club.
Each gifted peer, with type $C$ on arrival, for some $C$
with $1 \in C$, will directly upload to, on average, about
$K - |C| + \mu/\gamma$ one-club peers, and those will become
peer seeds which also could upload to about $\mu/\gamma$ more
one-club peers, and so forth, so that the total expected
number of one-club departures caused by the type $C$
gifted peer is $\frac{K - |C| + \mu/\gamma}{1 - \mu/\gamma}$. Summing these
quantities and subtracting them from the arrival rate of
peers without piece one gives $\Delta_{\mathcal{F}-\{1\}}$. So $\Delta_{\mathcal{F}-\{1\}} > 0$
indicates that the arrival rate of peers missing piece one
is larger than the upload rate of piece one, causing the
one-club size to grow linearly with time.
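
The heuristic above can be checked numerically. The following is a minimal sketch, not part of the proof: it sums the branching generations to recover the factor $1/(1 - \mu/\gamma)$, and then forms the net growth rate of the one-club as the arrival rate of peers missing piece one minus the rate of one-club departures ultimately induced by the fixed seed and by gifted peers. The function names and the dictionary representation of arrival rates (piece collection mapped to rate) are hypothetical choices made for this illustration.

```python
def branching_amplification(mu, gamma, generations=500):
    """Expected one-club departures ultimately caused by one upload of piece one.

    Sums the generations 1 + rho + rho^2 + ... with rho = mu / gamma,
    which converges to 1 / (1 - rho) when mu < gamma.
    """
    rho = mu / gamma
    total, term = 0.0, 1.0
    for _ in range(generations):
        total += term
        term *= rho
    return total


def one_club_growth_rate(arrival_rates, K, U_s, mu, gamma, piece=1):
    """Heuristic net growth rate of the club of peers missing `piece`.

    arrival_rates maps a frozenset of piece indices (type on arrival)
    to a Poisson arrival rate (an assumed representation).
    """
    if not 0 < mu < gamma:
        raise ValueError("the heuristic assumes 0 < mu < gamma")
    rho = mu / gamma
    amplification = 1.0 / (1.0 - rho)
    # Peers arriving without the piece feed the club.
    inflow = sum(rate for C, rate in arrival_rates.items() if piece not in C)
    # Uploads by the fixed seed, amplified by the branching process.
    outflow = U_s * amplification
    # Each gifted peer of type C directly helps about K - |C| + rho club members.
    for C, rate in arrival_rates.items():
        if piece in C:
            outflow += rate * (K - len(C) + rho) * amplification
    return inflow - outflow


# Hypothetical example: K = 3 and every peer arrives holding exactly one piece.
rates = {frozenset({1}): 1.0, frozenset({2}): 1.0, frozenset({3}): 1.0}
growth = one_club_growth_rate(rates, K=3, U_s=0.0, mu=1.0, gamma=2.0)
```

A positive return value corresponds to the regime in which the one-club is expected to grow linearly; a negative value corresponds to the regime in which the club can drain.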

The above analysis neglects the possibility that normal 
young peers can also receive piece one, creating infected 
peers. An infected peer can upload to one club peers, cre- 
ating former one-club peers, and to normal young peers, 
creating more infected peers. This results in a branching 



process comprised of infected peers and former one-club 
peers. By the theory of branching processes, the expected
number of infected offspring of a former one-club peer 
or an infected peer will converge to zero, as the fraction 
of one-club peers converges to one. Hence, when the 
one-club is large enough, the existence of infected peers 
does not appreciably affect the growth of the one-club. 
The detailed proof of transience is offered in Section VI. 

Second, we sketch the proof of Theorem 1(b) about
positive recurrence for the case $0 < \mu < \gamma \le \infty$ under
the assumption that (4) is valid for all $S \in \mathcal{C} - \{\mathcal{F}\}$. The
above discussion suggests that when $\Delta_{\mathcal{F}-\{1\}} < 0$, the
departure rate of the one-club is larger than the arrival
rate of peers missing piece one; therefore, the system has
the ability to recover from a single heavy load in the one-club.
Moreover, when $k = 2, 3, \ldots, K$ and there is a single
heavy load in the type $\mathcal{F} - \{k\}$ group, similar reasoning
suggests that the system can recover if $\Delta_{\mathcal{F}-\{k\}} < 0$.
To get a better idea of the proof, here we consider other
distributions of heavy load.

• Suppose there is a single heavy load in some type
$S$ group with $|S| \le K - 2$. Uploads from the
fixed seed (with rate $U_s$) and from new peers
holding pieces not in $S$ (with rate $\sum_{C : C \not\subseteq S} \lambda_C$)
keep creating departures from the type $S$ group.
If we ignore the period of time from when a
peer departs from the type $S$ group until the same
peer becomes a peer seed, we see that the average
remaining lifetime of every peer which departs
from the type $S$ group is greater than or equal
to $1/\gamma$. In this lifetime the peer uploads on average
approximately $\mu/\gamma$ pieces to type $S$ peers, which
creates more departures from the type $S$ group.
Including the root, every departure from the type
$S$ group can ultimately cause at least $\frac{1}{1 - \mu/\gamma}$ departures
from the type $S$ group, on average. Because
every new type $C$ peer with $C \not\subseteq S$ eventually
uploads on average $K - |C| + \mu/\gamma$ pieces to type $S$
peers, the departure rate of the type $S$ group is larger
than

\[
\frac{U_s + \sum_{C : C \not\subseteq S} \lambda_C \left(K - |C| + \mu/\gamma\right)}{1 - \mu/\gamma}.
\]

Because peers mainly download pieces from type
$S$ peers, almost all new type $C$ peers with $C \subseteq S$
ultimately join the type $S$ group. So the near term
arrival rate of the type $S$ group is less than but close
to $\sum_{C : C \subseteq S} \lambda_C$, which is smaller than the aggregate
departure rate of type $S$ peers by (4). So the system
can recover from the heavy load.

• Suppose there is a single heavy load in the type
$\mathcal{F}$ group, that is, the group of peer seeds. The
departure rate of peer seeds, $\gamma x_{\mathcal{F}}$, scales linearly
with the number of peer seeds, $x_{\mathcal{F}}$, as in an infinite
server queueing system. So the system can recover
however large the group of peer seeds is.

• Suppose there are heavy loads in at least two groups 
of different types, say types Ci and C2. In this 
condition, either Ci ^ C2 or C2 ^ Ci is true, 
so peers in at least one of the groups, say Ci, can 
upload pieces to peers in the other group, say C2. 
The rate of peers departing from the type C2 group 
is quite high, due to the large rate of uploads from 
type Ci peers, so the system can quickly escape 
from that region of the state space. 

The above paragraphs summarize how the system 
can recover from all distributions of heavy load. To 
provide a proof of stability it must also be shown that 
the load cannot spiral up without bound through some 
oscillatory behavior. For that we use a Lyapunov function 
and apply the Foster-Lyapunov stability criterion. The 
detailed proof is offered in Section VII. 
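
To make the single-heavy-load heuristic concrete, the sketch below enumerates every proper subset $S$ of pieces and compares the near-term arrival rate into the type $S$ group with the amplified helping rate described in the bullets above. It is only an illustration under the stated heuristic, assuming $0 < \mu < \gamma$; the function name `recovery_margins` and the dictionary representation of arrival rates are hypothetical, and a negative margin for every $S$ is what the heuristic asks for, not a substitute for the Lyapunov argument.

```python
from itertools import combinations


def recovery_margins(arrival_rates, K, U_s, mu, gamma):
    """Return {S: arrival rate into group S minus amplified helping rate}.

    arrival_rates maps frozenset(piece indices) -> Poisson arrival rate
    (an assumed representation). Only proper subsets S of {1, ..., K}
    are considered, since the peer-seed group always drains.
    """
    if not 0 < mu < gamma:
        raise ValueError("the heuristic assumes 0 < mu < gamma")
    rho = mu / gamma
    margins = {}
    for size in range(K):  # sizes 0 .. K-1 give the proper subsets
        for pieces in combinations(range(1, K + 1), size):
            S = frozenset(pieces)
            # New peers whose collection is contained in S can join group S.
            inflow = sum(r for C, r in arrival_rates.items() if C <= S)
            # The fixed seed and peers holding a piece outside S can help group S.
            help_rate = U_s + sum(
                r * (K - len(C) + rho)
                for C, r in arrival_rates.items()
                if not C <= S
            )
            margins[S] = inflow - help_rate / (1.0 - rho)
    return margins


# Hypothetical example: K = 3, one-piece arrivals at equal rates, no fixed seed.
equal = {frozenset({k}): 1.0 for k in (1, 2, 3)}
margins_equal = recovery_margins(equal, K=3, U_s=0.0, mu=1.0, gamma=2.0)

# Unequal rates with immediate departure of peer seeds (gamma = infinity).
unequal = {frozenset({1}): 1.0, frozenset({2}): 1.0, frozenset({3}): 0.1}
margins_unequal = recovery_margins(unequal, K=3, U_s=0.0, mu=1.0, gamma=float("inf"))
```

In the second example the margin for $S = \{1, 2\}$ is positive, matching the earlier observation that unequal rates with $\gamma = \infty$ lead to instability.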

VI. Proof of Transience in Theorem 1 

In the following the detailed proof of Theorem 1(a) is
given. It is obvious the system is transient if no copies of
piece $k$ can enter the system. Without loss of generality,
assume $0 < \mu < \gamma \le \infty$ and assume $\Delta_{\mathcal{F}-\{1\}} > 0$. For a
given time $t \ge 0$, define the following random variables,
using the terminology of Section V and Figure 2:

• $Y_t^a$ : number of normal young peers (group (a)) at
time $t$.

• $Y_t^b$ : number of infected peers (group (b)) at time
$t$.

• $Y_t^g$ : number of gifted peers (group (g)) at time $t$.

• $Y_t^e$ : number of one-club peers (group (e)) at time
$t$.

• $Y_t^f$ : number of former one-club peers (group (f))
at time $t$.

• $A_t$ : cumulative number of arrivals, up to time $t$, of
peers without piece one at time of arrival.

• $D_t$ : cumulative number of downloads of piece one,
up to time $t$. (Peers arriving with piece one are not
counted.)

• $N_t$ : number of peers at time $t$.

The system is modeled by an irreducible, countable-state
Markov chain. A property of such random processes
is that either all states are transient, or no state
is transient. Therefore, to prove Theorem 1(a), it is
sufficient to prove that some particular state is transient.
With that in mind, we assume that the initial state is the
one with $N_0$ peers, and all of them are one-club peers,
where $N_0$ is a large constant specified below. Given a
small number $\xi$ with $0 < \xi < 1$, let $\tau$ be the extended
stopping time defined by $\tau = \min\{t \ge 0 : Y_t^e + Y_t^f < (1 - \xi)N_t\}$,
with the usual convention that $\tau = \infty$ if
$Y_t^e + Y_t^f \ge (1 - \xi)N_t$ for all $t$. It suffices to prove that

\[
P\{\tau = \infty \text{ and } \lim_{t \to \infty} N_t = +\infty\} > 0. \tag{5}
\]

The probability of the event in (5) depends on only the
out-going transition rates for states such that $Y^e + Y^f \ge (1 - \xi)N$.
Thus, we can and do prove (5) instead for an
alternative system that has the same initial state, and
the same out-going transition rates for all states such
that $Y^e + Y^f \ge (1 - \xi)N$, as the original system. The
alternative system, however, guarantees an upper bound
on the aggregate rate of downloads of piece one by peers
in group (a), and a lower bound on the rate of downloads
by the set of peers in groups (a), (e) and (f). This can be
done so that the alternative system has the following six
properties, the first four of which hold for the original
system, and the last two of which hold for the original
system on the states with $Y^e + Y^f \ge (1 - \xi)N$:

1) A peer with a complete collection departs according
to an exponentially distributed random variable
with parameter $\gamma$.

2) Each peer in group (b), (g), or (f) uploads to the
set of peers in group (e) with rate at most $\mu$.

3) The fixed seed uploads to the set of peers in group
(e) with rate at most $U_s$.

4) Peers in group (a) can download piece one only
from peers in groups (b), (g), or (f), or from the
fixed seed.

5) Whenever the internal Poisson clock of a peer in
group (b), (g), or (f), or the fixed seed, ticks, the
probability the tick results in contacting a peer in
group (a) to upload to is less than or equal to $\xi$.

6) A peer in group (a), (b), or (g) that is not yet a peer
seed receives usable download opportunities at rate
greater than or equal to $\mu(1 - \xi)$. (If $Y^e + Y^f \ge (1 - \xi)N$,
these opportunities can be provided by
the peers in groups (e) and (f).)

The alternative system can be defined by supposing
that on the states with $Y^e + Y^f < (1 - \xi)N$: (i)
the opportunities for peers in groups (b), (g) or (f), or
the fixed seed, to upload to peers in group (a) are
discarded with some state-dependent positive probability,
and (ii) there is a phantom seed, having all pieces except
piece one, that uploads pieces to peers in groups (a), (b),
or (g) as necessary for property 6) above to hold. For
the remainder of this proof we consider the alternative
system, but for brevity of notation, use the same notation
for it as for the original system, and refer to it as the
original system.

Only peers in groups (a) and (e) download piece one; 
peers in the other three groups already have piece one.
A peer in group (a) downloading piece one immediately
moves to group (b), and a peer in group (e) downloading 
piece one immediately moves to group (f). Thus, a 
download of piece one creates either a group (b) peer or 
a group (f) peer. A group (b) peer or group (f) peer stays 
in the same group until it leaves the system. While a peer 
in group (b) or (f) is in the system it can generate more 
peers in groups (b) and (f) by uploading piece one, and 
those peers are considered to be offspring spawned by 
the peer. Since offspring can themselves spawn offspring, 
there is a branching process, and a group (b) or group 
(f) peer has a set of descendants. 

We shall consider the evolution of a portion of the 
system under some statistical assumptions that are dif- 
ferent from those in the original system. We refer to 
it as the autonomous branching system (ABS) because 
strong independence assumptions are imposed. The ABS 
pertains only to those peers that have piece one. It is 
shown below that the original system can be stochas- 
tically coupled to the ABS so that uploads of piece 
one happen in the original system only when they also 
happen in the ABS. We begin by considering only group 
(b) and group (f) peers. In the original system, a group 
(b) peer was formerly a group (a) peer, and a group 
(f) peer was formerly a group (e) peer; such previous 
history is irrelevant for the system under the ABS; the 
description below concerns such a peer only from the 
time it becomes a group (b) or group (f) peer. The 
statistical assumptions for the ABS involving these peers 
are as follows: 

• A group (b) peer is required to download $K - 1$
pieces; usable opportunities for such downloads arrive
according to a Poisson process of rate $\mu(1 - \xi)$.
(The interpretation is that, when a group (b) peer
appears, any piece it might have had besides piece
one is ignored or discarded.) After the $K - 1$
downloads, the group (b) peer remains in the system
as a peer seed for a seed dwell duration that is
exponentially distributed with parameter $\gamma$.

• A group (f) peer remains in the system for a seed
dwell duration that is exponentially distributed with
parameter $\gamma$.

• A group (b) peer or group (f) peer spawns group (b)
peers according to a Poisson process of rate $\xi\mu$ and
it spawns group (f) peers according to a Poisson
process of rate $\mu$.

• The Poisson processes for spawning offspring, as 
well as the seed dwell durations, are mutually 
independent. 

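The above assumptions uniquely determine the distribution
of the number of offspring, and therefore the total
number of descendants, of a group (b) or group (f) peer.
On average, a group (b) peer is in the system (as a group
(b) peer) for $\frac{K-1}{\mu(1-\xi)} + \frac{1}{\gamma}$ time units, and thus on average
a group (b) peer spawns $\xi\left(\frac{K-1}{1-\xi} + \frac{\mu}{\gamma}\right)$ offspring of type
(b) and $\frac{K-1}{1-\xi} + \frac{\mu}{\gamma}$ offspring in group (f). Similarly, on
average, a peer in group (f) spawns $\frac{\xi\mu}{\gamma}$ offspring of type
(b) and $\frac{\mu}{\gamma}$ offspring in group (f). Let $m_b$ denote one plus
the mean number of descendants of a group (b) peer and
let $m_f$ denote one plus the mean number of descendants
of a group (f) peer, in the ABS. Then by the theory of
branching processes, $(m_b, m_f)$ is the minimum nonnegative
solution to the equations

\[
\begin{pmatrix} m_b \\ m_f \end{pmatrix}
=
\begin{pmatrix} 1 \\ 1 \end{pmatrix}
+
\begin{pmatrix}
\xi\left(\frac{K-1}{1-\xi} + \frac{\mu}{\gamma}\right) & \frac{K-1}{1-\xi} + \frac{\mu}{\gamma} \\[4pt]
\frac{\xi\mu}{\gamma} & \frac{\mu}{\gamma}
\end{pmatrix}
\begin{pmatrix} m_b \\ m_f \end{pmatrix}.
\]

The two-by-two matrix involved here has rank one, and
the solution is easily found to be finite if

\[
\xi\left(\frac{K-1}{1-\xi} + \frac{\mu}{\gamma}\right) + \frac{\mu}{\gamma} < 1. \tag{6}
\]

If (6) holds,

\[
\begin{pmatrix} m_b \\ m_f \end{pmatrix}
=
\begin{pmatrix} 1 \\ 1 \end{pmatrix}
+
\frac{1 + \xi}{1 - \xi\left(\frac{K-1}{1-\xi} + \frac{\mu}{\gamma}\right) - \frac{\mu}{\gamma}}
\begin{pmatrix} \frac{K-1}{1-\xi} + \frac{\mu}{\gamma} \\[4pt] \frac{\mu}{\gamma} \end{pmatrix}
\]

and, in addition, the second moment of the number of
descendants of a peer of either group (b) or (f) is finite
and monotonically increasing in $\xi$. Note that

\[
\begin{pmatrix} m_b \\ m_f \end{pmatrix}
\to
\begin{pmatrix} \frac{K}{1 - \mu/\gamma} \\[4pt] \frac{1}{1 - \mu/\gamma} \end{pmatrix}
\quad \text{as } \xi \to 0.
\]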
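
As a numerical illustration of the mean-descendant computation, the sketch below assumes the offspring means implied by the ABS assumptions: a group (b) peer is present for $\frac{K-1}{\mu(1-\xi)} + \frac{1}{\gamma}$ time units on average, a group (f) peer for $\frac{1}{\gamma}$, and both spawn group (b) peers at rate $\xi\mu$ and group (f) peers at rate $\mu$. It computes the means both from a closed form that exploits the rank-one structure and by fixed-point iteration started from zero, which converges to the minimal nonnegative solution. The function name and argument order are hypothetical.

```python
def mean_descendants(K, mu, gamma, xi, iterations=10_000):
    """Return ((m_b, m_f) in closed form, (m_b, m_f) by iteration).

    m_b and m_f are one plus the mean number of descendants of a
    group (b) peer and a group (f) peer, respectively.
    """
    a = (K - 1) / (1.0 - xi) + mu / gamma  # mean (f)-offspring of a (b) peer
    c = mu / gamma                         # mean (f)-offspring of an (f) peer
    if xi * a + c >= 1.0:
        raise ValueError("mean number of descendants is not finite")
    # Rank-one structure: with s = xi*m_b + m_f we get s = (1+xi) + (xi*a + c)*s.
    s = (1.0 + xi) / (1.0 - xi * a - c)
    closed_form = (1.0 + a * s, 1.0 + c * s)
    # Fixed-point iteration from zero reaches the minimal nonnegative solution.
    m_b = m_f = 0.0
    for _ in range(iterations):
        weighted = xi * m_b + m_f
        m_b, m_f = 1.0 + a * weighted, 1.0 + c * weighted
    return closed_form, (m_b, m_f)


closed, iterated = mean_descendants(K=3, mu=1.0, gamma=2.0, xi=0.01)
near_zero, _ = mean_descendants(K=3, mu=1.0, gamma=2.0, xi=1e-9)
```

For very small $\xi$ the values approach $K/(1 - \mu/\gamma)$ and $1/(1 - \mu/\gamma)$, which are 6 and 2 for the parameters chosen here.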



Next, we extend the scope of the ABS to include a 
gifted peer; this entails the following statistical assump- 
tions: 

• A gifted peer with piece collection $C$ upon arrival
is required to download $K - |C|$ pieces; usable
opportunities for such downloads arrive according
to a Poisson process of rate $\mu(1 - \xi)$. After the
$K - |C|$ downloads, the gifted peer remains in
the system as a peer seed for a seed dwell duration
that is exponentially distributed with parameter $\gamma$.

• While a gifted peer is in the system, it spawns group
(b) peers according to a Poisson process of rate $\xi\mu$
and it spawns group (f) peers according to a Poisson
process of rate $\mu$.

• The Poisson processes for spawning offspring, as 
well as the seed dwell duration, are mutually inde- 
pendent. 

The mean time a gifted peer with initial piece collection
$C$ is in the system is thus $\frac{K - |C|}{\mu(1-\xi)} + \frac{1}{\gamma}$, so the mean total
number of descendants of a gifted peer (not including
the gifted peer itself) is given by

\[
m_g(C) = \left(\frac{K - |C|}{\mu(1-\xi)} + \frac{1}{\gamma}\right)\left(\xi\mu\, m_b + \mu\, m_f\right)
= \left(\frac{K - |C|}{1-\xi} + \frac{\mu}{\gamma}\right)\left(\xi m_b + m_f\right).
\]

Note that $m_g(C) \to \frac{K - |C| + \mu/\gamma}{1 - \mu/\gamma}$ as $\xi \to 0$.

Finally, we extend the scope of the ABS to include
the processes of arrivals of gifted peers and the uploads
of the fixed seed; this entails the following assumptions:

• For each $C$ with $1 \in C$, gifted peers with initial
piece collection $C$ arrive according to a Poisson
process of rate $\lambda_C$ (as in the original model).

• The fixed seed spawns peers in group (b) according
to a Poisson process of rate $\xi U_s$ and it spawns peers
in group (f) according to a Poisson process of rate
$U_s$.

• The Poisson processes of arrivals are mutually
independent.

• Gifted peers and offspring of the fixed seed are
considered to be root peers. The evolutions of the
descendants of root peers are mutually independent.

Let $\hat{D}_t$ denote the cumulative number of group (b)
and group (f) peers appearing in the ABS up to time $t$.

Lemma 2. The process $(D_t : t \ge 0)$ is stochastically
dominated (see the appendix for the definition) by $(\hat{D}_t : t \ge 0)$.

Proof: We describe a particular method of coupling
the ABS and the original system. By this, we mean a way
to construct both processes on a single probability space.
To do this, we start with the random variables governing
the ABS, and describe how the original system (i.e. a
system with the statistical description of the original
system) can be overlaid on the same probability space,
in such a way that $D_t \le \hat{D}_t$ for all $t$ with probability
one.

The first step is to adopt a new way of thinking about 
the ABS. In the ABS, the sets of descendants of the 
root peers form a partition of all group (b) and group 
(f) peers in the ABS (for this purpose, the descendants 
of a root peer include the root peer itself if the root 
peer is an offspring of the fixed seed, but not if the 
root peer is a gifted peer). Imagine that each root peer 
arrives with a randomly generated script for itself and 
its descendants. The script includes the sample paths of 
the Poisson processes that determine: when pieces are 
to be downloaded, when group (b) peers are spawned, 
and when group (f) peers are spawned, as well as 
the seed dwell durations sampled from the exponential 
distribution with parameter $\gamma$. Whenever some peer in



the ABS system spawns another, the portion of the script 
held by the parent associated with that offspring and its 
descendants becomes a script for that offspring. 

The next step is to build the original system using the 
same random variables, using the following assumptions. 
When thinning of Poisson processes is mentioned, it 
refers to randomly rejecting some points of a Poisson 
process to produce another point process with a specified 
intensity that is smaller than the rate of the Poisson 
process. 

1) The original system has independent Poisson arrivals
of peers of type $C$ at rate $\lambda_C$ for all $C$
with $1 \notin C$. These arrivals are not modeled in
the ABS, and are to be generated for the original
system independently of the ABS.

2) The arrival processes of gifted peers of type $C$
in the original system for all $C$ with $1 \in C$ are
identical to those in the ABS.

3) The point process of times that the seed uploads
piece one to one-club peers is a thinning of the rate
$U_s$ Poisson process governing creation of group (f)
peers in the ABS system.

4) The point process of times that the seed uploads
piece one to normal young peers is a thinning of
the rate $\xi U_s$ Poisson process governing creation of
group (b) peers in the ABS system.

5) The point process of times that a peer in group
(b), (g), or (f) uploads piece one to one-club peers
is a thinning of the rate $\mu$ Poisson process in the
script of the peer for spawning group (f) peers.

6) The point process of times that a peer in group
(b), (g), or (f) uploads piece one to normal young
peers is a thinning of the rate $\xi\mu$ Poisson process
in the script of the peer for spawning group (b)
peers.

7) A peer in the original system in group (b) or (g) 
that does not have a complete collection, down- 
loads useful pieces from a peer in group (e) or 
(f) at the jump times of the rate $\mu(1 - \xi)$ Poisson
process for downloads in its script. The peer can 
also make downloads at other times, to bring the 
total intensity of downloads from groups (e) and 
(f) up to at least ^^^^^^±^. 

8) The peer seed dwell time for any peer is specified 
in its script. 

A remark is in order about why the construction is 
possible. When one peer transfers a piece to another peer 
in the original system, it is considered an upload for the 
first peer and a download for the second. Thus, the timing 
of such transfers cannot be simultaneously governed by 
internal scripts of the two peers. In the construction noted
here, such conflict does not occur, because the scripts are
used to determine times that piece one can be uploaded, 
and the peers that are downloading piece one are in group 
(a) or (e), and are thus not yet following a script. And 
the scripts are used for downloading of pieces other than 
piece one, but do not constrain times that pieces other 
than piece one are uploaded. 

The resulting coupling satisfies the following properties.

• Any peer in group (b), (f), or (g) in the original 
system is also in the ABS, in the same group and 
with the same time of arrival to that group. (Such 
peers can remain in the ABS longer than they stay 
in the original system.) 

• Any peer in group (f) in the original system (and 
thus also in the ABS) departs from both systems at 
the same time. (Peers in groups (b) or (g) in the 
original system can stay longer in the ABS than in 
the original system.) 

• Whenever some peer $p_1$ in the original system
uploads piece one to some other peer $p_2$, peer
$p_1$ simultaneously spawns peer $p_2$ in the ABS.
Afterwards, peer $p_2$ is either in group (b) or in group
(f) in both systems.

• There can be more group (b) and more group 
(f) peers in the ABS than in the original system 
because the spawning rates in the ABS system are 
greater than in the original system, and group (b) 
and group (f) peers in the ABS can have fewer 
pieces in the ABS system than they have in the 
original system, and thus they can stay longer in 
the ABS system than in the original system. 

In particular, by the third point above, whenever piece
one is uploaded in the original system a peer of group (b)
or (f) is created in the ABS system. Therefore, $D_t \le \hat{D}_t$
for all $t \ge 0$ with probability one, which by the definition
of stochastic domination, proves the lemma. ■

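Corollary 3. Given $\epsilon > 0$, if $\xi$ is sufficiently small, then
for all $B$ sufficiently large,

\[
P\left\{ D_t \le B + \frac{U_s + \sum_{C : 1 \in C} \lambda_C \left(K - |C| + \mu/\gamma\right)}{1 - \mu/\gamma}\, t + \epsilon t \ \text{ for all } t \ge 0 \right\} \ge 0.9. \tag{7}
\]
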
Proof: By Lemma 2, it suffices to prove Corollary
3 with $D$ replaced by $\hat{D}$. Let $\tilde{D}$ be a random process
associated with the ABS, denoting the cumulative counting
process that results if all the descendants of a root
peer are counted at the time the root peer arrives. The
processes $\hat{D}$ and $\tilde{D}$ count the same downloads of piece
one, but $\tilde{D}$ does so sooner, so $\hat{D}_t \le \tilde{D}_t$ for all $t$. Thus,
it suffices to prove Corollary 3 with $D$ replaced by $\tilde{D}$.

The process $\tilde{D}$ is a compound Poisson process, which
can be decomposed into the sum of several independent
compound Poisson processes: one for each type $C$ with
$1 \in C$, and one for peer seeds generated directly by the
fixed seed. The mean arrival rate for $\tilde{D}$ satisfies:

\[
\xi U_s m_b + U_s m_f + \sum_{C : 1 \in C} \lambda_C\, m_g(C)
\ \to\ \frac{U_s + \sum_{C : 1 \in C} \lambda_C \left(K - |C| + \mu/\gamma\right)}{1 - \mu/\gamma}
\quad \text{as } \xi \to 0,
\]

and the batch sizes have finite second moments for $\xi$
sufficiently small, and the second moments are increasing
in $\xi$. Therefore, Corollary 3 follows from Kingman's
moment bound (see Proposition 20 in the appendix). ■

Lemma 4. Given $\epsilon > 0$, if $B$ is sufficiently large,

\[
P\left\{ A_t \ge -B + \Big( \sum_{C : 1 \notin C} \lambda_C - \epsilon \Big) t \ \text{ for all } t \ge 0 \right\} \ge 0.9. \tag{8}
\]

Proof: The process $A$ is a Poisson process with rate
$\sum_{C : 1 \notin C} \lambda_C$. Thus, (8) follows from Kingman's moment
bound (see Proposition 20 in the appendix). ■



Lemma 5. The process $Y_t^a + Y_t^b + Y_t^g$ is stochastically
dominated by the number of customers in an $M/GI/\infty$
queueing system with initial state zero, arrival rate
$\lambda_{total}$, and service times having mean $\frac{K}{\mu(1-\xi)} + \frac{1}{\gamma}$.

Proof: The idea of the proof is to show how, with a
possible enlargement of the underlying probability space,
an $M/GI/\infty$ system can be constructed on the same
probability space as the original system, so that for any
time $t$, $Y_t^a + Y_t^b + Y_t^g$ is less than or equal to the number
of peers in the $M/GI/\infty$ system. Let the $M/GI/\infty$
system have the same arrival process as the original
system; it is a Poisson process of rate $\lambda_{total}$.

An important point is that any peer in group (a), (b), or
(g) is either receiving useful download opportunities at
rate at least $\mu(1 - \xi)$ or is a peer seed (possible if it is in
group (b) or (g)) and is thus waiting for a departure time
that is exponentially distributed with parameter $\gamma$. We
can thus imagine that any arriving peer has an internal
Poisson clock that ticks at rate $\mu(1 - \xi)$ and an internal,
exponentially distributed random variable with parameter
$\gamma$. Whenever its internal clock ticks, it can download a
useful piece, until it either joins the one club (in which
case it leaves group (a) and joins group (e)) or it becomes
a peer seed, in which case it remains in the system as
a peer seed for an amount of time equal to its internal
exponential random variable of parameter $\gamma$.

An arriving peer in the original system may already
have some pieces at the time of arrival, or its intensity
of downloading pieces could be greater than $\mu(1 - \xi)$,
or it might leave the union of groups (a), (b), and (g) by
becoming a one-club peer. These factors reduce
the time that a peer remains in the union of groups (a),
(b) and (g). The $M/GI/\infty$ system is constructed
by ignoring those speedup factors. Specifically, in the
$M/GI/\infty$ system, each arriving peer has to download
$K$ pieces at times governed by its internal Poisson clock,
and then remain as a peer seed for a time duration
given by its internal exponentially distributed random
variable for seed time. The service time distribution for
the $M/GI/\infty$ system is thus the sum of $K$ independent
exponential random variables with parameter $\mu(1 - \xi)$
plus a single exponential random variable with parameter
$\gamma$. Any peer that is in groups (a), (b), or (g) in the original
system will be in the $M/GI/\infty$ system, and the mean
service time for the $M/GI/\infty$ system is $\frac{K}{\mu(1-\xi)} + \frac{1}{\gamma}$. ■
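
The service time of the dominating queue can be sampled directly, which gives a quick sanity check on its mean. The sketch below is illustrative only: it draws $K$ independent exponential download times of rate $\mu(1 - \xi)$ plus one exponential seed dwell time of rate $\gamma$, and compares the empirical mean with $\frac{K}{\mu(1-\xi)} + \frac{1}{\gamma}$. The function names, parameter values, sample size, and seed are arbitrary assumptions.

```python
import random


def sample_service_time(K, mu, gamma, xi, rng):
    """One service time: K downloads at rate mu*(1 - xi), then a seed dwell."""
    download_time = sum(rng.expovariate(mu * (1.0 - xi)) for _ in range(K))
    seed_dwell = rng.expovariate(gamma)
    return download_time + seed_dwell


def mean_service_time(K, mu, gamma, xi):
    """Mean of the service time distribution described above."""
    return K / (mu * (1.0 - xi)) + 1.0 / gamma


rng = random.Random(0)  # fixed seed so the run is reproducible
K, mu, gamma, xi = 3, 1.0, 2.0, 0.1
samples = [sample_service_time(K, mu, gamma, xi, rng) for _ in range(20_000)]
empirical_mean = sum(samples) / len(samples)
exact_mean = mean_service_time(K, mu, gamma, xi)
```

With these parameters the exact mean is $3/0.9 + 0.5$, and the empirical mean should agree to within sampling error.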

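Corollary 6. Given $\epsilon_0 > 0$ and $\xi > 0$, if $B$ is sufficiently
large,

\[
P\{ Y_t^a + Y_t^b + Y_t^g \le B + \epsilon_0 t \ \text{ for all } t \ge 0 \} \ge 0.9. \tag{9}
\]

Proof: The Corollary follows from Lemmas 5 and
21 with $m$ in Lemma 21 equal to $\frac{K}{\mu(1-\xi)} + \frac{1}{\gamma}$ and $\epsilon$ equal
to $\epsilon_0$. ■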

The proof of Theorem 1(a) is now completed. 

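• Select $\epsilon > 0$ so that $2\epsilon < \Delta_{\mathcal{F}-\{1\}}$.

• Select $\xi > 0$ so small that (7) holds for sufficiently
large $B$.

• Select $\epsilon_0$ small enough that $\frac{\epsilon_0}{\Delta_{\mathcal{F}-\{1\}} - 2\epsilon} < \xi$.

• Select $B$ large enough that (7), (8), and (9) hold.

• Select $N_0$ large enough that $\frac{B}{N_0 - 2B} < \xi$.

Let $\mathcal{E}$ be the intersection of the three events on the left
sides of (7), (8), and (9). By the choices of the constants,
(7), (8), and (9) hold, so that $P\{\mathcal{E}\} \ge 0.7$. To complete
the proof, it will be shown that $\mathcal{E}$ is a subset of the event
in (5), thereby establishing (5). Since $N_t$ is greater than
or equal to the number of peers in the system that don't
have piece one, on $\mathcal{E}$,

\[
N_t \ge N_0 + A_t - D_t \ge N_0 - 2B + \left(\Delta_{\mathcal{F}-\{1\}} - 2\epsilon\right)t
\]

for all $t \ge 0$. Therefore, on $\mathcal{E}$, for any $t \ge 0$,

\[
\frac{Y_t^a + Y_t^b + Y_t^g}{N_t}
\le \frac{B + \epsilon_0 t}{N_0 - 2B + \left(\Delta_{\mathcal{F}-\{1\}} - 2\epsilon\right)t}
\le \max\left\{ \frac{B}{N_0 - 2B},\ \frac{\epsilon_0}{\Delta_{\mathcal{F}-\{1\}} - 2\epsilon} \right\} < \xi.
\]

Thus, $\mathcal{E}$ is a subset of the event in (5) as claimed. This
completes the proof of Theorem 1(a).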

VII. Proof of Positive Recurrence in Theorem 1

Theorem 1(b) is proved in this section. The first
subsection treats the case $0 < \mu < \gamma \le \infty$ and the
second subsection treats the case $0 < \gamma \le \mu$.

A. Proof of Positive Recurrence when $0 < \mu < \gamma \le \infty$
in Theorem 1

The detailed proof of Theorem 1(b) when $0 < \mu < \gamma \le \infty$
is given in this subsection. Assume $0 < \mu < \gamma \le \infty$
and assume (4) is valid for all $S \in \mathcal{C} - \{\mathcal{F}\}$. For
any nonnegative function $F = F(x)$ on the state space
of the system, the drift of $F$ at state $x$ is defined as

\[
Q(F)(x) := \sum_{x' : x' \ne x} q(x, x')\left[F(x') - F(x)\right]. \tag{10}
\]

If, as usual, the diagonal elements $q(x, x)$ of the transition
matrix $Q$ are chosen so that row sums are zero,
$Q(F)$ is the product of the matrix $Q$ and function $F$,
viewed as a vector. In this paper, we apply the following
lemma implied by the Foster-Lyapunov criterion.

Lemma 7. The P2P Markov process is positive recurrent
and $E[N] < +\infty$, where $N$ is a random variable with
the stationary distribution for the number of peers in the
system, if there is a nonnegative function $W(x)$ on the
state space of the process, such that (i) $\{x : W(x) \le c\}$
is a finite set for any constant $c$, and (ii) there exist
$n_0 > 0$ and $\xi > 0$ such that $QW \le -\xi n < 0$ whenever
$n > n_0$. We call such a $W$ a valid Lyapunov function.

Proof: For any $x$, $q(x, x')$ is nonzero for only
finitely many values of $x'$, so $QW$ is finite for
all $x$. Therefore the constant $B$, defined by $B = \max_{x : n \le n_0} QW(x)$,
is finite. The lemma follows from
the combined Foster-Lyapunov stability criterion and
moment bound (see Proposition 18 in the appendix) with
$V = W$, $f(x) = \xi n$, and $g(x) = B I_{\{n \le n_0\}}$. ■
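
The drift in (10) and the negative-drift condition of Lemma 7 can be illustrated on a much simpler chain than the P2P system. The sketch below is a toy example, not the paper's Lyapunov function: it uses an infinite-server queue with arrival rate `lam` and per-customer departure rate `gamma` (the same structure as the peer-seed group discussed in Section V), takes $W(n) = n$, and checks that the drift is at most $-\text{slope} \cdot n$ beyond a threshold. The names `drift`, `infinite_server_transitions`, `slope`, and `n0` are hypothetical.

```python
def drift(F, x, transitions):
    """Q(F)(x): sum over x' != x of q(x, x') * (F(x') - F(x))."""
    return sum(
        rate * (F(x_next) - F(x))
        for x_next, rate in transitions(x)
        if x_next != x
    )


def infinite_server_transitions(lam, gamma):
    """Transition rates of an infinite-server queue with state n >= 0."""
    def transitions(n):
        moves = [(n + 1, lam)]                # one arrival
        if n > 0:
            moves.append((n - 1, gamma * n))  # one of n customers departs
        return moves
    return transitions


lam, gamma = 3.0, 0.5
transitions = infinite_server_transitions(lam, gamma)


def W(n):
    """Toy Lyapunov function; its sublevel sets {n : n <= c} are finite."""
    return n


slope = gamma / 2.0      # plays the role of the positive constant in condition (ii)
n0 = 2.0 * lam / gamma   # beyond this, lam - gamma*n <= -slope*n
condition_holds = all(
    drift(W, n, transitions) <= -slope * n for n in range(int(n0) + 1, 200)
)
```

Here the drift of $W$ at state $n$ equals `lam - gamma * n`, so the condition holds for every $n$ above `n0`, while it fails for small $n$ where arrivals dominate.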
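The Lyapunov function we use is, if $0 < \mu < \gamma < \infty$:

\[
W := \sum_{C : C \in \mathcal{C}} r^{|C|}\, T_C, \tag{11}
\]

where

\[
T_C :=
\begin{cases}
\frac{1}{2} E_C^2 + \alpha E_C\, \phi(H_C) & \text{if } C \ne \mathcal{F} \\[4pt]
\frac{1}{2} n^2 & \text{if } C = \mathcal{F},
\end{cases}
\]

and if $0 < \mu < \gamma = \infty$:

\[
W := \sum_{C : C \in \mathcal{C} - \{\mathcal{F}\}} r^{|C|}\, T_C, \tag{12}
\]

with the following notation: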

. r e (0,i),a! e G (0,i),a G 

are positive constants to be specified, with r and 
f3 small, d large, and a close to one. 

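• $\mathcal{E}_C := \{C' : C' \subseteq C\}$, which is the collection
of types of peers which are or can become type
$C$ peers. Note that $\mathcal{E}_C$ is downward closed (i.e. a
lower set) for any $C$.

• $\mathcal{H}_C := \{C' : C' \in \mathcal{C}, C' \not\subseteq C\}$, which is the
collection of types of peers which can help type $C$
peers. Note that $\mathcal{H}_C$ is upward closed (i.e. an upper
set) for any $C$. Also, $\mathcal{F} \in \mathcal{H}_C$ for any $C \in \mathcal{C} - \{\mathcal{F}\}$
and $\mathcal{H}_{\mathcal{F}} = \emptyset$.

• $E_C := \sum_{C' : C' \in \mathcal{E}_C} x_{C'}$ and
$H_C := \sum_{C' : C' \in \mathcal{H}_C} x_{C'}$, e.g. $H_{\mathcal{F}} = 0$.

• $\phi$ is the function with parameters $d$, $\beta$, defined as

\[
\phi(x) :=
\begin{cases}
2d + \frac{1}{2\beta} - x & \text{if } 0 \le x \le 2d \\[4pt]
\frac{\beta}{2}\left(x - 2d - \frac{1}{\beta}\right)^2 & \text{if } 2d < x \le 2d + \frac{1}{\beta} \\[4pt]
0 & \text{if } x > 2d + \frac{1}{\beta}.
\end{cases}
\]

Thus $\phi'(x) = -1$ for $0 \le x \le 2d$, $\phi'(x) = 0$ for
$x \ge 2d + \frac{1}{\beta}$, and $\phi'$ increases linearly from $-1$
to $0$ over the interval $[2d, 2d + \frac{1}{\beta}]$. In particular,
$-1 \le \phi'(x) \le 0$ for $x \ge 0$.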
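
A direct implementation makes the shape of $\phi$ easy to inspect. The sketch below assumes the piecewise form consistent with the derivative properties just stated (slope $-1$ up to $2d$, a quadratic piece of curvature $\beta$ on $[2d, 2d + 1/\beta]$, and zero afterwards); the function names and the sample parameter values are hypothetical.

```python
def phi(x, d, beta):
    """Piecewise linear-quadratic function with parameters d and beta."""
    if x < 0:
        raise ValueError("phi is defined for x >= 0")
    knee = 2.0 * d           # end of the linear piece
    end = knee + 1.0 / beta  # beyond this point phi vanishes
    if x <= knee:
        return knee + 1.0 / (2.0 * beta) - x
    if x <= end:
        return 0.5 * beta * (x - end) ** 2
    return 0.0


def phi_prime(x, d, beta):
    """Derivative of phi: -1, then linear in x up to 0, then 0."""
    knee = 2.0 * d
    end = knee + 1.0 / beta
    if x <= knee:
        return -1.0
    if x <= end:
        return beta * (x - end)
    return 0.0


d, beta = 5.0, 0.25  # illustrative values only
values = [phi(x, d, beta) for x in (0.0, 10.0, 12.0, 14.0, 20.0)]
slopes = [phi_prime(x, d, beta) for x in (0.0, 10.0, 12.0, 14.0, 20.0)]
```

With these values the two pieces agree at $x = 2d = 10$, and the derivative rises from $-1$ at $x = 10$ to $0$ at $x = 14$.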

In the proof, we consider the following two classes of 
states, where e is to be selected with < e < |. The 
classes overlap and their union includes every nonzero 
state: 

Definition 1. Class I is the set of states $x$ such that there
exists $S \in \mathcal{C} - \{\mathcal{F}\}$, so that $x_S/n > 1 - \epsilon$; class II is the
set of states $x$ such that there exist $C_1, C_2 \in \mathcal{C}$, either
$C_1$ and $C_2$ being distinct or both equal to $\mathcal{F}$, so that
$x_{C_1}/n > \epsilon/2^K$ and $x_{C_2}/n > \epsilon/2^K$.
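
The two classes can be expressed as a small classifier, which also makes it easy to see that they may overlap. This is a sketch under the reading of Definition 1 in which the class II threshold is $\epsilon/2^K$; the state is represented as a dictionary from piece collections to peer counts, and the function name is hypothetical.

```python
def classify_state(x, K, eps):
    """Return the set of classes ("I", "II") that the state x belongs to.

    x maps frozenset(piece indices) -> number of peers of that type
    (an assumed representation of the state vector).
    """
    full = frozenset(range(1, K + 1))
    n = sum(x.values())
    if n == 0:
        return set()  # the definition concerns nonzero states only
    classes = set()
    # Class I: a single type other than the full collection dominates.
    if any(C != full and count / n > 1 - eps for C, count in x.items()):
        classes.add("I")
    # Class II: two distinct types, or the peer-seed type alone, exceed eps / 2^K.
    threshold = eps / 2 ** K
    heavy = [C for C, count in x.items() if count / n > threshold]
    if len(heavy) >= 2 or full in heavy:
        classes.add("II")
    return classes


only_one = classify_state({frozenset({1}): 99, frozenset(): 1}, K=2, eps=0.2)
only_two = classify_state({frozenset({1}): 50, frozenset({2}): 50}, K=2, eps=0.2)
both = classify_state({frozenset({1}): 90, frozenset({2}): 10}, K=2, eps=0.2)
seeds = classify_state({frozenset({1, 2}): 10}, K=2, eps=0.2)
```

The third example lies in both classes, illustrating that the classes overlap.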

The main idea of the proof is to show that $W$ is
a valid Lyapunov function for an appropriate choice
of $(r, d, \beta, \alpha, \epsilon)$. The given parameters of the network,
$K$, $U_s$, $\lambda = (\lambda_S : S \in \mathcal{C})$, $\gamma$ and $\mu$, are treated as constants.
Functions on the state space are considered which
may depend on the variables $r, d, \beta, \alpha$ and $\epsilon$. It is
convenient to adopt the big theta notation $\Theta(\cdot)$, with
the understanding that it is uniform in these variables;
this is summarized in the following definitions.

Definition 2. Given functions $f$ and $g$ on the state space,
we say $f = \Theta(g)$ if there exist constants $k_1, k_2, n_0 > 0$,
not depending on $(r, d, \beta, \alpha, \epsilon)$, such that $k_1|g(x)| \le |f(x)| \le k_2|g(x)|$
for all $x$ such that $n > n_0$.

For example, $2 = \Theta(1)$, $\lambda_{total}\, n = \Theta(n)$, $d \ne \Theta(1)$,
$d = \Theta(d)$. Similarly, we adopt notions of "small enough"
and "large enough" that are uniform in $(r, d, \beta, \alpha, \epsilon)$:

Definition 3. The statement, "condition A is true if $x > 0$
is small enough", means there exists a constant $k > 0$,
not depending on $(r, d, \beta, \alpha, \epsilon)$, such that A is true for
any $x \in (0, k)$. Similarly, the statement, "condition A
is true if $x > 0$ is large enough", means there exists
a constant $k > 0$, not depending on $(r, d, \beta, \alpha, \epsilon)$, such
that A is true for any $x \in (k, \infty)$.

Some additional notation is applied in the following 
proofs: 

• := M+j. We have > vaax.x(t>{x) and 
> min{x : (p{x) = 0} + d > 1. 

. For any X,X' C C, 

^x,x' 'l2cexJ2c':C'ex'^c,C', where Tc,c' 
is defined in (1). 

• Dc is defined by 



Dc :-- 



J2i:teJ^^c,cu{i} ifCT^J" 

7a;^ if C = J^, 7 < 00 . 

ifC = J",7 = oo 



Except in the case $C = \mathcal{F}$ and $\gamma = \infty$, $D_C$ is the
aggregate rate at which peers leave the group of type $C$
peers.

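• For any $X \subset \mathcal{C}$, $x_X := \sum_{C: C \in X} x_C$, $D_X := \sum_{C: C \in X} D_C$, $D_{total} := D_{\mathcal{C}}$, $\lambda_X := \sum_{C: C \in X} \lambda_C$, $\lambda^*_X := \sum_{C: C \in X} \lambda_C (K - |C| + \mu/\gamma)$.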

Now we start to prove that W given by (56) or (12) is a valid Lyapunov function. The following proof applies if either $0 < \mu < \gamma < \infty$ or $0 < \mu < \gamma = \infty$, with differences being stated when necessary.

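To begin, we identify a simple approximation to the drift of W. Notice that $Q(\cdot)$ is linear, so if $0 < \mu < \gamma < \infty$,

$$Q(W) = \sum_{C: C \in \mathcal{C}} r^{|C|} Q(T_C),$$

where

$$Q(T_C) = \begin{cases} \frac{1}{2} Q(E_C^2) + \alpha Q(E_C \phi(H_C)) & \text{if } C \neq \mathcal{F} \\ \frac{1}{2} Q(n^2) & \text{if } C = \mathcal{F}. \end{cases}$$

If $0 < \mu < \gamma = \infty$,

$$Q(W) = \sum_{C: C \in \mathcal{C} - \{\mathcal{F}\}} r^{|C|} Q(T_C).$$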



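Define LW, an approximation of Q(W), as follows: If $0 < \mu < \gamma < \infty$,

$$LW := \sum_{C: C \in \mathcal{C}} r^{|C|} LT_C,$$

where

$$LT_C := \begin{cases} E_C Q(E_C) + \alpha E_C Q(\phi(H_C)) & \text{if } C \neq \mathcal{F} \\ n Q(n) & \text{if } C = \mathcal{F}. \end{cases}$$

If $0 < \mu < \gamma = \infty$,

$$LW := \sum_{C: C \in \mathcal{C} - \{\mathcal{F}\}} r^{|C|} LT_C. \quad (13)$$

The following lemma provides a bound on the approximation error:

Lemma 8. $|Q(W) - LW| \le M_\phi (D_{total} + 1)\Theta(1)$.

Proof: Compare Q(W) and LW term by term. Consider terms of the form $Q(T_C)$ and $LT_C$. First assume $C \neq \mathcal{F}$. Because $\alpha < 1$, we can write

$$|Q(T_C) - LT_C| \le a_1 + a_2 + a_3,$$

where

$$a_1 := \left|\tfrac{1}{2} Q(E_C^2) - E_C Q(E_C)\right| \le \lambda_{\mathcal{E}_C} + \Gamma_{\mathcal{E}_C,\mathcal{H}_C} \le \lambda_{total} + D_{total},$$

$$a_2 := \left|Q(E_C \phi(H_C)) - [Q(E_C)\phi(H_C) + E_C Q(\phi(H_C))]\right|,$$

$$a_3 := |Q(E_C)\phi(H_C)| \le M_\phi (\lambda_{\mathcal{E}_C} + \Gamma_{\mathcal{E}_C,\mathcal{H}_C}).$$

The only way $E_C$ and $\phi(H_C)$ can simultaneously change is that some peer with type in $\mathcal{E}_C$ becomes a peer with type in $\mathcal{H}_C$, causing $E_C$ to decrease by 1, and $\phi(H_C)$ to decrease by at most $\frac{K + \mu/\gamma}{1 - \mu/\gamma}$, so

$$a_2 \le \frac{K + \mu/\gamma}{1 - \mu/\gamma}\, \Gamma_{\mathcal{E}_C,\mathcal{H}_C}.$$

From the discussion above and the fact that $\Gamma_{\mathcal{E}_C,\mathcal{H}_C} \le D_{total}$, we have

$$|Q(T_C) - LT_C| \le M_\phi \Theta(1) + M_\phi D_{total} \Theta(1) = M_\phi (D_{total} + 1)\Theta(1) \quad (14)$$

for every $C \in \mathcal{C} - \{\mathcal{F}\}$.

Second, assume $C = \mathcal{F}$ and $\gamma < \infty$. Then,

$$|Q(T_C) - LT_C| = \left|\tfrac{1}{2} Q(n^2) - n Q(n)\right| = \tfrac{1}{2}(\lambda_{total} + D_{\mathcal{F}}) \le \tfrac{1}{2}(\lambda_{total} + D_{total}),$$

which implies (14) for $C = \mathcal{F}$. There are only finitely many terms of $T_C$ in W ($2^K$ in total), and $r < 1$: Lemma 8 follows. ■

Now we offer Lemma 9 and Lemma 10, both concerning upper bounds of $LT_C$. They are applied for the proof of Lemma 12.

Lemma 9. If d is large enough, $Q(E_C) \le \Theta(1)$, $Q(\phi(H_C)) \le M_\phi \Theta(1)$, $LT_C \le M_\phi \Theta(E_C) \le M_\phi \Theta(n)$ for any $C \in \mathcal{C}$.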



Proof: The upper bound for the drift of $E_C$ is obvious: $Q(E_C) \le \lambda_{\mathcal{E}_C} \le \lambda_{total}$. Next consider $Q(\phi(H_C))$. Since $H_{\mathcal{F}} = 0$, we restrict our attention to the case $C \neq \mathcal{F}$. Because $\phi$ is a decreasing function, only the rate for $H_C$ to decrease contributes to the positive part in the drift of $\phi(H_C)$, so to find an upper bound of $Q(\phi(H_C))$ it suffices to consider the rates of transitions that decrease $H_C$. There are two ways $H_C$ can decrease: peers with one type in $\mathcal{H}_C$ becoming another type in $\mathcal{H}_C$, with aggregate rate $\Gamma_{\mathcal{H}_C,\mathcal{H}_C}$, and peer seeds departing, with rate $D_{\mathcal{F}}$. Because the maximum that $\phi(H_C)$ can jump up is less than or equal to $\frac{1}{1 - \mu/\gamma}$, an upper bound for the drift of $\phi(H_C)$ is

$$Q(\phi(H_C)) \le \frac{1}{1 - \mu/\gamma}\left(\Gamma_{\mathcal{H}_C,\mathcal{H}_C} + D_{\mathcal{F}}\right) \le \frac{1}{1 - \mu/\gamma}\left[U_s + H_C(\mu + \gamma 1_{\{\gamma < \infty\}})\right] = \Theta(1) + H_C \Theta(1).$$

We can choose d large enough, i.e. $d > \frac{1}{1 - \mu/\gamma}$, so $M_\phi > 2d + 1/\beta + \frac{1}{1 - \mu/\gamma}$. Thus $Q(\phi(H_C))$ vanishes when $H_C > M_\phi$, because $\phi(H_C)$ vanishes when $H_C > 2d + 1/\beta$ and the jump size of $H_C$ is bounded below by $-\frac{1}{1 - \mu/\gamma} > -d$. Hence

$$Q(\phi(H_C)) \le \Theta(1) + M_\phi \Theta(1) \in M_\phi \Theta(1),$$

because $M_\phi > 1$.

Finally, the bound on $LT_C$ follows from the other two bounds already proved. Hence, Lemma 9 is proved. ■

Lemma 10. If d is large enough, $1 - \alpha$, $\epsilon M_\phi$, $\beta$ are small enough and $\beta \left(\frac{K + \mu/\gamma}{1 - \mu/\gamma}\right)^2 < \frac{1}{\alpha} - 1$, then for any $S \in \mathcal{C} - \{\mathcal{F}\}$ and any nonzero state x such that $x_S/n > 1 - \epsilon$,

$$LT_S \le \tfrac{1}{2} \Delta_S E_S. \quad (15)$$

Remark 11. Recall that $LT_S = E_S[Q(E_S) + \alpha Q(\phi(H_S))]$, where the term $E_S Q(E_S)$ can be traced back to the quadratic term $\frac{1}{2} E_S^2$ of W, and $\alpha E_S Q(\phi(H_S))$ can be traced back to the term $\alpha E_S \phi(H_S)$ of W. Before giving the proof of Lemma 10, we describe why the term $\alpha E_S \phi(H_S)$ is needed and how it helps $LT_S$ be negative. It has been discussed that the worst distribution of heavy load is when the heavy load aggregates in a type with only one missing piece. Consider the case $|S| = K - 1$. Notice that $E_S Q(E_S) = E_S(\lambda_{\mathcal{E}_S} - \Gamma_{\mathcal{E}_S,\mathcal{H}_S})$ and $\Gamma_{\mathcal{E}_S,\mathcal{H}_S} \ge D_S \ge \frac{x_S}{n}\left(U_s + H_S \mu \frac{1 - \mu/\gamma}{K + \mu/\gamma}\right)$. Here we assume $\frac{x_S}{n} > 1 - \epsilon$. So $\Gamma_{\mathcal{E}_S,\mathcal{H}_S}$ increases almost proportionally to $H_S$. When $H_S$ is larger than d for d sufficiently large, $\Gamma_{\mathcal{E}_S,\mathcal{H}_S}$ is larger than $\lambda_{\mathcal{E}_S}$, so $E_S Q(E_S)$ is negative and is bounded above by $-\Theta(E_S) = -\Theta(n)$. But when $H_S$ is smaller than d, $\Gamma_{\mathcal{E}_S,\mathcal{H}_S}$ can be smaller than $\lambda_{\mathcal{E}_S}$, so $E_S Q(E_S)$ is positive and is lower bounded by $\Theta(E_S) = \Theta(n)$, which has the wrong sign. The term $\alpha E_S \phi(H_S)$ is chosen so that $\alpha Q(\phi(H_S))$ balances out the coefficient $\lambda_{\mathcal{E}_S} - \Gamma_{\mathcal{E}_S,\mathcal{H}_S}$ when $H_S$ is small, so that $LT_S$ is still negative and upper bounded by $-\Theta(E_S)$.

The definition of $H_S$ implies that, when $x_S$ is close to n, $H_S$ is the mean number of type S peers that will be helped by the helping peers, which are the ones in $\mathcal{H}_S$. (By saying a peer is helped, we mean a piece is uploaded to the peer.) In other words, $H_S$ is the stored potential for helping type S peers. As type S peers are helped by the helping peers, the potential decreases, with the magnitude of decrease equal to the number of type S peers which are helped. So if we only consider the piece transmissions involving one peer of type S and one peer of type in $\mathcal{H}_S$, the downward drift of $H_S$ has magnitude less than or equal to the downward drift of $E_S$. If we only consider the external arrivals and the uploads from the fixed seed, the terms in the drift of $H_S$ are $\frac{1}{1 - \mu/\gamma}\left[\sum_{C: C \in \mathcal{H}_S} \lambda_C (K - |C| + \mu/\gamma) + U_s \mu/\gamma\right]$, and the terms in the drift of $E_S$ are $\lambda_{\mathcal{E}_S} - U_s$; the former is larger than the latter precisely because of (4). Finally, $H_S$ has a bit more downward drift due to peers other than type S peers uploading to peers in $\mathcal{H}_S$, but that is small for $\epsilon$ sufficiently small. Combining the downward and the other drifts, we see that the drift of $H_S$ is approximately the same as the drift of $E_S$, with the drift of $H_S$ a little greater. The difference of the two drifts is $\Delta_S$, defined in (4). Also, when $H_S$ is small, the function $\phi$ at $H_S$ has derivative $-1$. Thus the coefficient of $E_S$ in $LT_S$, which is $Q(E_S) + \alpha Q(\phi(H_S))$, is negative because $\alpha$ is close to 1, so $LT_S$ is upper bounded by $-\Theta(E_S) = -\Theta(n)$.

In summary, the above explains the reason we included the term $E_S \phi(H_S)$ in the Lyapunov function; it balances out the positive drift of $E_S$ when $H_S$ is small.

Proof: Now the detailed proof of Lemma 10 is given. Consider a nonzero state x of class I, with $S \in \mathcal{C} - \{\mathcal{F}\}$, $x_S/n > 1 - \epsilon$. Recall that (4) is assumed to hold; $\Delta_S < 0$. We begin with three observations.
First, consider a lower bound for $Q(H_S)$:

^ J2 {K-\C'\+n/j)Q{xc') 



Q{Hs) 



> 



1- n/j 
1 

1- n/j 
1 



C':C'eHs 



Kis+Ds'-_ 



F 



7 



'Hs,ns 



D: 



l-/i/7 

where bi := Ds — T-^^^-fig — xj^p,. In view of 

Ds > {l-e)iU, + xnslJ') 

> U, + xnsf^-e[e{l) + xns&W] (17) 

and 

^ns,ns +^^"1^ < eUs + x-Hsli, (18) 
it follows that 

h>U,-e[e{l) + xnsm)]- (19) 
Combining (16) with (19), yields: 



7. 

(16) 



Q{Hs) > -hi - e[e{l) + xns&m, 
hi := Ds - ^^-j- {X*ns + Us) . 



1 



Second, 

Ds > x-Hsl^i^ - e) 



m/7 

xn,Ml) = Hse{l) 



(20) 
(21) 

(22) 



because xhs < Hs < ^^j^j^ x-Us and e < i. 

Third, substituting (22) mto (21) yields that if d is 
sufficiently large, then hi > dQ{l) — 0(1) whenever 
Hs > d. Therefore, if d is sufficiently large. 



hi > {) whenever Hs > d. 



(23) 



The remainder of the proof is divided into two, 
according to the value of Hs- 

• Hs < M^: Under this condition, x-Ug < and 
Mrf, > 1, so (20) imphes: 



QiHs 



> 



-/ii - eM^e(l). 



(24) 



Because $\phi'$ exists and is Lipschitz continuous with Lipschitz constant $\beta$, and because the magnitudes of the jumps of $H_S$ are bounded by $\frac{K + \mu/\gamma}{1 - \mu/\gamma}$, Lemma 19 yields

$$Q(\phi(H_S)) \le \phi'(H_S) Q(H_S) + b_2 \quad (25)$$



where 

(A«s + Tgg^us + ^HsMs + -^^m)- (26) 

Upper bounds for the terms in the right hand side of (25) are found next. First, a bound for $b_2$ is found. By (17) and (18),

< Ds + eM^e{iy, (27) 

< Ds + eM^e(l). (28) 

Substituting (27) and (28) into the right side of (26), 
yields 

62 < /3e(l) + ^eM^e(l) + 

/if + ^ 



1_ M 

7 



(29) 



< 



1 i?s + /3e(i), 



(30) 



where to obtain (30) from (29), we assume 

^(S7f)'<^-landeM,<l. 

Second, a bound for $\phi'(H_S) Q(H_S)$ is found. Taking into account that $-1 \le \phi' \le 0$, multiply both sides of (24) by $\phi'(H_S)$ and use the fact $\phi'(H_S) = -1$ for $H_S \le d$, and (23), to obtain:

$$\phi'(H_S) Q(H_S) \le -\phi'(H_S) h_1 + \epsilon M_\phi \Theta(1) \le h_1 + \epsilon M_\phi \Theta(1). \quad (31)$$

Substituting (21), (30) and (31) into (25) yields 



+eM^e(l)+/3e(l). 



(32) 



We obtain a bound on $Q(E_S) + \alpha Q(\phi(H_S))$, the coefficient of $E_S$ in $LT_S$, using (32) and the facts $Q(E_S) \le \lambda_{\mathcal{E}_S} - D_S$ and $\alpha < 1$, as follows:

Q{Es) + aQ{<t>{Hs)) 

+ eM^e(l) + /3e(l) 
< As + (l-a)e(l) 

+ eM^e(l)+/3e(l). (33) 

Because $\Delta_S < 0$, if $1 - \alpha$, $\epsilon M_\phi$, $\beta$ are close to 0, the last three terms in (33) can be made small compared to $|\Delta_S|$, so $Q(E_S) + \alpha Q(\phi(H_S)) \le \frac{1}{2}\Delta_S$, which implies (15).

Hs > M^: To take care of this case, assume 



d > 



Hence 



Q{(j}{Hs)) vanishes for Hs in this range. By (22), 



Q{Es) + aQ{cj>{Hs)) 



< 
< 

< 



6(1) - M^G(l) 



1 



Ac 



if d is large enough so that $M_\phi$ is large enough. Therefore (15) holds.

The proof of Lemma 10 is complete. ■
Lemmas 9 and 10 will be used to prove the following lemma.



Lemma 12. If d 

K+iihV / 1 
1-^/7 ' 



is large enough, (1 — 
are small enough, and 



I, 



(a) On class I, LW < -r^e(n); 

(b) On class II, LW < -r^e^O^n^) + M^e(n). 

Proof: First consider Lemma 12(a). Since there are only finitely many types, we can fix a set $S \in \mathcal{C} - \{\mathcal{F}\}$ and consider the set of class I states x for which $x_S/n > 1 - \epsilon$. Since $\epsilon \in (0, \frac{1}{2})$, $E_S \ge \frac{1}{2} n$. By assumption in this section, $\Delta_S < 0$. By Lemma 10,

$$LT_S \le \tfrac{1}{4} \Delta_S n \in -\Theta(n). \quad (34)$$



For type C with |C| > jS"], Lemma 9 and (34) imply 

r^^^LTc < rM^rl^le(n) < 2-^-'^r^^\\LTs\. (35) 

if rM^ is chosen to be small enough. 

For type C with |C| < \S\ but C S, Ec < en; 
Lemma 9 and (34) imply 

r^^^LTc 



< 



if eM^r~^ is chosen to be small enough. 
Equations (35) and (36) imply that 



(36) 



LW 



LTs 



E • 

C:\C\>\S\ 



.\C\ 



LTc + 



< 



< 



.\s\ 



C:\C\<\S\,C^S 

LTs+lr\^^\LTs\ 



LTc 



^rl'^lAsn< -r^e(n), 



which proves Lemma 12(a). 

Next consider Lemma 12(b). First, suppose $C_1 \neq C_2$ and consider the set of class II states x such that $x_{C_1}/n > \eta$, $x_{C_2}/n > \eta$, where $\eta = \epsilon/2^K$. For such states:

r^co-Wc, > > — xciAi > m^n G e^e(n). (37) 
Since > > (37) implies 

< ~e^e{n^) + e{n). (38) 

Lemma 9 indicates that Ec2Q{<l^{Hc2)) < M^9(n), so 
(38) imphes 

LTc2 = Ec2Q{Ec2) + aEc2Q{4>{Hc2)) 

< -e^e(n^) + M0e(n). (39) 

Second, consider the set of class II states x such that 
xj^/n > rj, where 77 = . If 7 = 00, this set is 
empty, so suppose 7 < 00. For such states, 

LTj^ = nQ{n) = n{Xtotai - ixj:) 
< n{Xtotai - nin) 
= -ee(n^) + e(n). (40) 



is small enough. 



Recall that Lemma 9 implies for any C, LTc < 
M^Q(n). Therefore, for either condition Ci % C2 or 
Ci=C2= (39) and (40) imply that, over the set of 
aU class n states, 

LW < r^^^^LTc2 + J2 ^^-^ 

C:CjtC2 

which proves Lemma 12(b). ■ 

With Lemmas 8 and 12, Theorem 1(b) in the case $0 < \mu < \gamma \le \infty$ can be proved:

Proof of Theorem 1(b): On class I, 



D 



total 



C:C^S 

< Us+xhsIJ'+ E —{Us+nii) 

C:C^S ^ 

< 2{U^ + enii) = e{l)+€e{n). 
So Lemma 8 implies that on class I,

$$|Q(W) - LW| \le \epsilon M_\phi \Theta(n) + M_\phi \Theta(1).$$

Combining this with Lemma 12(a) implies that under the conditions of Lemma 12, on class I,

$$Q(W) \le LW + |Q(W) - LW| \le -r^K \Theta(n) + \epsilon M_\phi \Theta(n) + M_\phi \Theta(1) \in -r^K \Theta(n) + M_\phi \Theta(1). \quad (41)$$



On class II, $D_{total} \le U_s + n\mu = \Theta(n)$, so Lemma 8 implies that

$$|Q(W) - LW| \le M_\phi \Theta(n).$$

Combining this with Lemma 12(b) implies that under the conditions of Lemma 12, on class II,



Q{W) 



< 
< 



LW +\Q{W)- LW\ 



(42) 



Equations (41) and (42) imply that if $(r, d, \beta, \alpha, \epsilon)$ satisfies the conditions of Lemma 12, there exists $\xi > 0$ sufficiently small such that $Q(W) \le -\xi n$ for all n larger than some constant. For such $\xi$ and such $(r, d, \beta, \alpha)$, W is a valid Lyapunov function, so by Lemma 7, Theorem 1(b) for the case $0 < \mu < \gamma \le \infty$ is proved. ■



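B. Proof of Positive Recurrence when $0 < \gamma \le \mu$ in Theorem 1

Now we consider the case when $0 < \gamma \le \mu$. Assume $U_s + \sum_{C: k \in C} \lambda_C > 0$ for all $k \in \mathcal{F}$. Then $U_s + \lambda^*_{\mathcal{H}_C} > 0$ for any $C \in \mathcal{C} - \{\mathcal{F}\}$. Consider a Lyapunov function of the following form: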



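$$W' := \sum_{C: C \in \mathcal{C}} r^{|C|} T'_C, \quad (43)$$

where

$$T'_C := \begin{cases} \frac{1}{2} E_C^2 + p E_C \phi(H'_C) & \text{if } C \neq \mathcal{F} \\ \frac{1}{2} n^2 & \text{if } C = \mathcal{F}, \end{cases}$$

$H'_C := \sum_{C': C' \in \mathcal{H}_C} (K + 1 - |C'|) x_{C'}$, and p is a constant (i.e. $p = \Theta(1)$) such that

$$\lambda_{\mathcal{E}_C} - p(U_s + \lambda^*_{\mathcal{H}_C}) < 0, \quad \forall C \in \mathcal{C} - \{\mathcal{F}\}. \quad (44)$$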

The variable $\alpha$ is not used in this section, so the big theta notation is uniform in $(r, \beta, d, \epsilon)$.
Define LW' as follows:

$$LW' := \sum_{C: C \in \mathcal{C}} r^{|C|} LT'_C, \quad (45)$$

where

$$LT'_C := \begin{cases} E_C Q(E_C) + p E_C Q(\phi(H'_C)) & \text{if } C \neq \mathcal{F} \\ n Q(n) & \text{if } C = \mathcal{F}. \end{cases}$$

Lemmas 8 and 9 can be verified as before, with $H_C$, W, LW, and $LT_C$ replaced by $H'_C$, W', LW', and $LT'_C$, respectively. The following lemma, similar to Lemma 10, can be established:

Lemma 13. If d is large enough and $\epsilon M_\phi$, $\beta$ are small enough, then for any $S \in \mathcal{C} - \{\mathcal{F}\}$ such that $x_S/n > 1 - \epsilon$,

$$LT'_S \le \tfrac{1}{2}\left[\lambda_{\mathcal{E}_S} - p(U_s + \lambda^*_{\mathcal{H}_S})\right] E_S. \quad (46)$$

Proof: Suppose S & C — {J^} and Xs/n > 1 — e, 
e e (0, |), one lower bound for Q{H'g) is: 

Q{H's) > \*^^+Ds-Tns,ns-x^l (47) 

where bi was introduced in (16), because 7 < /i. 
Substituting (19) into (48) yields 

Q{H's) > X*^^ + Us- e[e(l) + xns^i^)]- (49) 
Consider two conditions of H'g-. 

• Hg < M^: Under this condition, x-^g < and 
> 1, so (49) becomes: 

QiH's) >Us + X*ns - eM^e(l). (50) 

Because $\phi'$ exists and is Lipschitz continuous with Lipschitz constant $\beta$, and because the magnitude of the jump of $H'_S$ is bounded by $K + 1$, by Lemma 19,

$$Q(\phi(H'_S)) \le \phi'(H'_S) Q(H'_S) + b'_2 \quad (51)$$

where 

Consider the term b'2. By (27) and (28), and assum- 
ing eM^ < 1, we have 

6'2 < ^e(l) + ^eM^e(l) + /3£)se(l) 
< /3Dse(l) + /3e(l). (52) 

If ^ is small enough, ^£»s6(l) < ^Ds, so (52) 
becomes: 

b'2<^Ds + m^). (53) 

Substituting (50) and (53) into (51), and applying 

Q{Es) < Xss - Ds, we can bound Q{Es) + 
pQ{4}{H'g)), the coefficient of Es in LT'g, as fol- 
lows: 

Q{Es)+pQ{4>{H's)) 
< X£,-\Ds+p<t>'{H's){Us + X*u^) + 

^e(i) + eM^e(i) 

= ^ss-p{Us + y^^) + b'^ + 

^e(l) + eM^e(l), (54) 



where 

6'3 := p{l + cl^'[H's)){Us + X*n^)-\Ds 

^ {-\Ds if H's<d 

~ \q{1) - rfe(l) if H's>d' 

because Ds > {1 - €)x-HsIJ' > 2(k+i) ^sI^ = 
<d{H'g). Hence if d is large enough, 63 < 0. If 
15, eM^ are close to 0, the last two terms in (54) 
can be neglected. Thus, Q{Es) + pQ{(j){H'g)) < 
^[Afg — p{Us + A^g)], which implies (46). 
• Hg > M^: Under this condition, choose d such 
thatd> K+1, so > 2d+l/l3 + K+l.¥lence 
Q{(j){H'g)) vanishes for Hg in this range. The fact 
that Ds = e{H's) yields that 

QiEs)+pQ{<f>iH's)) 

< Xe,-Ds 

< e(l)-M^e(l) 

< li^ss - PiUs + X*^J], 

if d is large enough, and hence also M^. Therefore 
(46) holds. 

This completes the proof of Lemma 13. ■
With Lemmas 8, 9 and 13, Lemma 12 with LW replaced by LW' can be easily verified to be valid. Thereby Theorem 1(b) under the condition $0 < \gamma \le \mu$ is proved.

VIII. Extensions
A. General Piece Selection Policies

A piece selection policy is used to choose which piece is transferred whenever one peer or the fixed seed is to upload a piece to a chosen peer. The random useful piece selection policy is assumed in Theorem 1, but the theorem can be extended to a large class of piece selection policies. Such an extension was noted in [10] for the less general model of that paper. Essentially the only restriction needed is that if the uploading peer or fixed seed has a useful piece for the downloading peer, then a useful piece must be transferred. This restriction is similar to a work conserving restriction in the theory of service systems. In particular, Theorem 1 extends to cover a broad class of rarest first piece selection policies. Peers can estimate which pieces are more rare
in a distributed way, by exchanging information with the 
peers they contact. Even more general policies would 
allow the piece selection to depend in an arbitrary way 
on the piece collections of all peers. 

To be specific, consider the following family $\mathcal{H}$ of piece selection policies. Each policy in $\mathcal{H}$ corresponds to a mapping h from $\mathcal{C} \times (\mathcal{C} \cup \{\mathcal{F}\}) \times \mathcal{S}$ to the set of probability distributions on $\mathcal{F}$, satisfying the usefulness constraint:

$$\sum_{i \in B - A} h_i(A, B, x) = 1 \quad \text{whenever } B \not\subset A,$$

with the following meaning of h:

• When a type A peer is to download a piece from a type B peer and the state of the entire network is x, piece i is selected with probability $h_i(A, B, x)$, for $i \in \mathcal{F}$.

• When a type A peer is to download a piece from the fixed seed and the state of the entire network is x, piece i is selected with probability $h_i(A, \mathcal{F}, x)$, for $i \in \mathcal{F}$.

Theorem 1 can be extended to piece selection policies in $\mathcal{H}$. One minor change is needed, because the Markov process may not be irreducible for some piece selection policies. In general, the set of all states that are reachable from the empty state is the unique minimal closed set of
states, and the process restricted to that set of states is 
irreducible. For example, if the lowest numbered useful 
piece is selected at each download opportunity, then the 
minimal closed set of states consists of the states such 
that each peer has either no pieces or a consecutively 
numbered set of pieces beginning with the first piece. 
See [10] for further discussion. We state the result as a 
theorem. 

Theorem 14. (Stability conditions for general useful piece selection policies) Consider the network model of Section III, except with the random piece selection policy replaced by a policy h in $\mathcal{H}$. (a) If either of the two conditions in Theorem 1(a) holds, then the Markov process is transient, and the number of peers in the system converges to infinity with probability one. (b) If either of the two conditions in Theorem 1(b) holds, the Markov process restricted to the closed set of states is positive recurrent, the time to reach the empty state from any initial state has finite mean, and the equilibrium distribution $\pi$ is such that $\sum_x \pi(x)|x| < \infty$.

Thus, with the possible exception of the borderline 
case, rarest first piece selection does not increase the 
region of stability. 
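To make the family $\mathcal{H}$ concrete, the following minimal sketch writes three policies as functions returning a probability distribution over pieces and checks the usefulness constraint. It assumes types are frozensets of piece indices, the fixed seed is represented by the full set, and the state x is a dictionary from types to peer counts; all function names are hypothetical.

```python
from fractions import Fraction

def random_useful(A, B, x):
    """Pick uniformly among the pieces B has and A lacks."""
    useful = B - A
    return {i: Fraction(1, len(useful)) for i in useful}

def lowest_numbered(A, B, x):
    """Always pick the lowest numbered useful piece."""
    return {min(B - A): Fraction(1)}

def rarest_first(A, B, x):
    """Pick uniformly among the useful pieces with the fewest copies in x."""
    useful = B - A
    copies = {i: sum(n for C, n in x.items() if i in C) for i in useful}
    fewest = min(copies.values())
    rarest = [i for i in useful if copies[i] == fewest]
    return {i: Fraction(1, len(rarest)) for i in rarest}

def satisfies_usefulness(h, A, B, x):
    """Check that h puts total probability one on B - A when B is not in A."""
    if B <= A:
        return True  # the constraint applies only when B is not a subset of A
    dist = h(A, B, x)
    return sum(dist.get(i, 0) for i in B - A) == 1
```

Each policy is work conserving in the sense described above: whenever the uploader holds a useful piece, some useful piece is sent.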

B. Network Coding 

Network coding, introduced by Ahlswede, Cai, and Yeung [21], can be naturally incorporated into P2P distribution networks, as noted in [7]. The related work [22] considers all to all exchange of pieces among a fixed population of peers through random contacts and network coding. The method can be described as follows. The file to be transmitted is divided into K data pieces, $m_1, m_2, \ldots, m_K$. The data pieces are taken to be vectors of some fixed length r over a finite field $\mathbb{F}_q$ with q elements, where q is some power of a prime number. If the piece size is M bits, this can be done by viewing each message as an $r = \lceil M/\log_2(q) \rceil$ dimensional vector over $\mathbb{F}_q$. Any coded piece e is a linear combination of the original K data pieces: $e = \sum_{i=1}^{K} \theta_i m_i$; the vector of coefficients $(\theta_1, \ldots, \theta_K)$ is called the coding vector of the coded piece; the coding vector is included whenever a coded piece is sent. Suppose the fixed seed uploads coded pieces to peers, and peers exchange coded pieces. In this context, the type of a peer A is the subspace $V_A$ of $\mathbb{F}_q^K$ spanned by the coding vectors of the coded pieces it has received. Once the dimension of $V_A$ reaches K, peer A can recover the original message. Let $\mathcal{V}$ denote the set of all subspaces of $\mathbb{F}_q^K$, so $\mathcal{V}$ is the set of possible types.
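The dimension of a peer's subspace, and hence whether the peer can decode, can be computed by Gaussian elimination on its coding vectors. The sketch below is illustrative only: it assumes q is a prime p (the text allows any prime power, which would need full finite field arithmetic), and the function names are hypothetical.

```python
def rank_mod_p(vectors, p):
    """Dimension of the span of the given coding vectors over GF(p), p prime."""
    rows = [[v % p for v in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        # Find a row at or below position `rank` with a nonzero entry here.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)   # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p
                           for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def can_decode(vectors, K, p):
    """A peer can recover the file once its coding vectors span GF(p)^K."""
    return rank_mod_p(vectors, p) == K
```

For instance, over GF(5) the vectors (1, 2) and (3, 1) are linearly dependent, so a peer holding only those two coded pieces has a one dimensional type and cannot yet decode a file with K = 2.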

When peer A contacts peer B, suppose peer B sends peer A a random linear combination of its coded pieces, where the coefficients are independent and uniformly distributed over $\mathbb{F}_q$. Equivalently, the coding vector of the coded piece sent from B is uniformly distributed over $V_B$. The coded piece is considered useful to A if adding it to A's collection of coded pieces increases the dimension of $V_A$. Equivalently, the piece from B is useful to A if its coding vector is not in the subspace $V_A \cap V_B$. The probability the piece is useful to A is therefore given by

$$P\{\text{piece from B is useful to A}\} = 1 - \frac{|V_A \cap V_B|}{|V_B|} = 1 - q^{\dim(V_A \cap V_B) - \dim(V_B)}.$$

If peer B can possibly help peer A, meaning $V_B \not\subset V_A$ (true, for example, if $\dim(V_B) > \dim(V_A)$), the probability that a random coded piece from B is helpful to A is greater than or equal to $1 - \frac{1}{q}$. Similarly, the probability a random coded piece from the seed is useful to any peer A with $\dim(V_A) \le K - 1$ is also greater than or equal to $1 - \frac{1}{q}$.
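The usefulness probability depends only on q and two dimensions, so it is easy to evaluate exactly. The following is a minimal sketch; the function name `prob_useful` and its argument names are hypothetical.

```python
from fractions import Fraction

def prob_useful(q, dim_intersection, dim_b):
    """Probability that a uniformly random vector of V_B lies outside V_A.

    dim_intersection is dim(V_A ∩ V_B) and dim_b is dim(V_B).
    """
    if not 0 <= dim_intersection <= dim_b:
        raise ValueError("need 0 <= dim(V_A ∩ V_B) <= dim(V_B)")
    return 1 - Fraction(q) ** (dim_intersection - dim_b)
```

When B can help A, the intersection has dimension at most $\dim(V_B) - 1$, so the returned value is at least $1 - 1/q$, in agreement with the bound stated above.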

The network state x specifies the number of peers in the network of each type. There are only finitely many types, so the overall state space is still countably infinite. Moreover, the Markov process is easily seen to be irreducible. A proof of the following variation of Theorem 1 is summarized below. Let $\tilde{\mu} = \left(1 - \frac{1}{q}\right)\mu$.

Theorem 15. (Stability conditions for a network coding based system) Suppose random linear network coding with vectors over $\mathbb{F}_q^K$ is used, with random peer contacts and parameters K, q, $(\lambda_V : V \in \mathcal{V})$, $U_s$, $\gamma$, and $\mu$. Suppose $\lambda_{\mathbb{F}_q^K} = 0$ if $\gamma = \infty$, and $\lambda_{total} > 0$.

(a) The Markov process is transient if either of the following two conditions is true:

• $0 < \mu < \gamma \le \infty$ and for some $V^- \in \mathcal{V}$ with $\dim(V^-) = K - 1$,

$$\lambda_{total} > \frac{U_s + \sum_{V: V \not\subset V^-} \lambda_V (K - \dim(V) + 1)}{1 - \mu/\gamma}.$$

• $0 < \gamma \le \mu$, $U_s = 0$, and $\{V \in \mathcal{V} : \lambda_V > 0\}$ does not span $\mathbb{F}_q^K$.

(b) The process is positive recurrent and $E[n] < \infty$ in equilibrium, if either of the following two conditions is true:

• $0 < \mu < \gamma \le \infty$ and for any $V^- \in \mathcal{V}$ with $\dim(V^-) = K - 1$,

Xtotai < 

Us+ ^v(k- dim{V) + 

• $0 < \gamma \le \mu$ and either $U_s > 0$ or $\{V \in \mathcal{V} : \lambda_V > 0\}$ spans $\mathbb{F}_q^K$.

The gap between the necessary and sufficient conditions in Theorem 15 can be made arbitrarily small by taking q large enough.

For the case that peers arrive with pieces, network coding is quite effective at reducing the impact of the missing piece syndrome. For example, suppose peers with no pieces arrive at rate $\lambda_0$ and peers with one piece arrive at rate $\lambda_1$, where the coding vector for the piece given to a peer at time of arrival is uniformly distributed over all $q^K$ possibilities. (So with probability $q^{-K}$ the coding vector is the all zero vector and the piece is useless.) Suppose there are no other arrivals, that $U_s = 0$, and $\gamma = \infty$. Thus, the total arrival rate is $\lambda_{total} = \lambda_0 + \lambda_1$, and the fraction of peers arriving with one (possibly useless) piece is $f = \frac{\lambda_1}{\lambda_{total}}$. Then Theorem 15 yields that the Markov process is transient if $f < \frac{q}{(q-1)K}$ and positive recurrent if $f > \frac{q^2}{(q-1)^2 K}$. For example, if $q = 64$ and $K = 200$, the Markov process is transient if $f < \frac{64}{63 \cdot 200} \approx 0.00507$ and positive recurrent if $f > \frac{64^2}{63^2 \cdot 200} \approx 0.00516$. In contrast, without network coding and a fraction f of peers arriving with one uniformly randomly selected data piece, Theorem 1 implies the network is transient for any $f < 1$.
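The two thresholds in this example are simple functions of q and K, so the numerical values can be reproduced directly. The sketch below assumes the setting of the example ($U_s = 0$, $\gamma = \infty$, arrivals with zero or one coded piece); the function name `coding_thresholds` is hypothetical.

```python
from fractions import Fraction

def coding_thresholds(q, K):
    """Return (f_transient, f_recurrent) for the one-coded-piece example.

    The process is transient if f < f_transient and positive recurrent
    if f > f_recurrent, where f is the fraction arriving with one piece.
    """
    f_transient = Fraction(q, (q - 1) * K)
    f_recurrent = Fraction(q * q, (q - 1) ** 2 * K)
    return f_transient, f_recurrent

lower, upper = coding_thresholds(64, 200)
print(float(lower), float(upper))
```

The ratio of the two thresholds is $q/(q-1)$, which illustrates how the gap between the transience and recurrence conditions closes as q grows.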



We comment briefly on how the proof of Theorem 1 can be modified to yield Theorem 15. First, consider how the proof of Theorem 1(a) can be modified to prove Theorem 15(a). Consider the main case, $0 < \mu < \gamma \le \infty$. Let $V^-$ be the subspace of $\mathbb{F}_q^K$ with dimension $K - 1$ appearing in part (a). To incorporate network coding, the partition of peers described in Section V should be replaced by the following partition:

• Normal young peer: A normal young peer is a peer A such that $V_A$ is a proper subset of $V^-$.

• Infected peer: An infected peer is a peer B that was a normal young peer when it first arrived, but at the current time, $V_B \not\subset V^-$.

• Gifted peer: A gifted peer is a peer G such that at the time of its arrival, $V_G \not\subset V^-$.

• One-club peer: A peer of type $V^-$.

• Former one-club peer: A former one-club peer is a peer in the system that is not a one-club peer but at some earlier time was a one-club peer.

For any $N_0 \ge 1$, it is possible to reach the state with $N_0$ one-club peers and no other peers in the network.

Call a peer A enlightened if $V_A \not\subset V^-$. Note that gifted peers are enlightened when they arrive, and every other peer must become enlightened before departing. A peer becoming enlightened with network coding is analogous to a peer downloading the missing piece without network coding. In particular, for the proof of Theorem 15, the process $D_t$ should be the cumulative number of downloads causing the recipient peers to become enlightened.

The same autonomous branching system (ABS) can 
be used as in the proof of Theorem 1. Lemma 2 remains 
true, but the coupling argument used to prove it becomes 
more subtle. The issue is that the rate that a group (b) 
or (g) peer downloads pieces can be less than /i(l — ^), 
because random linear combinations are sent that are not 
always useful. This effect causes the group (b) and group 
(g) peers to remain in the system longer, so that they can 
continue to upload useful pieces to one club peers for 
longer. However, note that if A is a group (b) or (g) peer 
that is not a peer seed, and B is a one-club peer, then 
the probability a random piece from A is useful to B 
is less than or equal to the probability a random piece 
from B is useful to A. Therefore, if the internal clocks 
of the group (b) and (g) peers are slowed down so that 
their download rate of useful pieces matches that of the 
original system, then their upload rate of useful pieces to the one club peers will still be at least as large as in the original system.

The other parts of the proof of Theorem 1(a) readily carry over to imply Theorem 15(a).

Next, the modifications of the proof of Theorem 1(b) needed to prove Theorem 15(b) are described. The same approach works with the same form of Lyapunov function, except $\mathcal{V}$ is used as the set of types instead of $\mathcal{C}$. In places where the cardinality $|C|$ of a type C is used in the proof of Theorem 1(b), the dimension $\dim(V)$ of a type V is used in the proof of Theorem 15(b). In some of the places that $\mu$ is used in the proof of Theorem 1(b), it should be replaced by $\tilde{\mu}$.

The condition (55) holding for all E V is equiv- 
alent to the following: for any 5 e V - F^, A5 < 0, 
where 



A< 



V:VGS,VeV 




dim{V) 



The condition $\Delta_S < 0$ means that the rate of arrival of peers that can become type S peers is less than a lower bound on the long term rate that peers of type S receive useful pieces. The particular Lyapunov function we use in case $0 < \mu < \gamma < \infty$ is:



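$$W := \sum_{V: V \in \mathcal{V}} r^{\dim(V)} T_V, \quad (56)$$

where

$$T_V := \begin{cases} \frac{1}{2} E_V^2 + \alpha E_V \phi(H_V) & \text{if } V \neq \mathbb{F}_q^K \\ \frac{1}{2} n^2 & \text{if } V = \mathbb{F}_q^K, \end{cases}$$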



with $\alpha$, r, d, $\beta$, and the function $\phi$ as in the proof of Theorem 1(b), and

• $\mathcal{E}_V := \{V' : V' \subset V\}$, which is the collection of types of peers which are or can become type V peers.

• $\mathcal{H}_V := \{V' : V' \in \mathcal{V}, V' \not\subset V\}$, which is the collection of types of peers which can help type V peers. Notice that $\mathbb{F}_q^K \in \mathcal{H}_V$ for any $V \in \mathcal{V} - \{\mathbb{F}_q^K\}$ and $\mathcal{H}_{\mathbb{F}_q^K} = \emptyset$.



Xv 



The choice of Hy here is motivated by Remark 11. 
The proof that this Lyapunov function works for proving 
Theorem 15(b) parallels the proof of Theorem 1(b). 

Remark 16. When network coding is considered, it is typically assumed that peers do not exchange descriptions of the pieces they already have. This is likely because such descriptions are more complex than simple bit vectors indicating data pieces used without network coding, and because network coding works quite well even without such exchange. If exchange of information were used, then any time a peer A with subspace $V_A$ transfers a piece to a peer B with subspace $V_B$ such that $V_A \not\subset V_B$, a useful transfer could be achieved. Theorem 15 remains true under this mode of operation if $\tilde{\mu} = \mu$ and $q \to \infty$ is taken in part (b), and the gap between parts (a) and (b) shrinks to zero.

C. Modeling faster recovery for unsuccessful contacts 

One aspect of the model is that the times between upload attempts by a peer or a seed do not depend on whether the attempts are successful. In practice, it can be expected that if an attempt is not successful because there is no useful piece to transfer, then the time to the next attempt can be reduced, perhaps by some constant factor $\eta > 1$, such as $\eta = 10$. We discuss briefly how this might be addressed in the model of this paper, and the implications. The model of this paper is push oriented, in that the times that peers and the fixed seed attempt uploads are generated by their internal Poisson clocks. If we assume that each of those clocks runs faster by a factor $\eta$ until the next clock tick, whenever there is no useful piece to upload, then two things happen when there is a large one club. First, the rate of download opportunities for a young, gifted, or infected peer increases by a factor close to $\eta$, which is probably a violation of an implied soft download constraint in our model. Secondly, this would worsen the missing piece syndrome if some of the peers arrive with pieces at time of arrival (i.e. if there are gifted peers) because those peers would be uploading piece one a factor $\eta$ more slowly than they would be downloading other pieces. Their contribution to the upload rate of piece one before they become peer seeds would thus be reduced by a factor $\eta$.

For the original model, it would be mathematically equivalent for peer-to-peer contacts to be modeled as pulls, with a peer randomly contacting another peer to download from at the times of an internal rate $\mu$ Poisson clock (while the fixed seed would still push pieces). If each peer attempting to download a piece would run its clock faster by a factor $\eta > 1$ following an unsuccessful attempt, until the next attempt, then again two things happen when there is a large one club. First, the rate of upload opportunities for a young, gifted, or infected peer increases by a factor close to $\eta$, which is probably a violation of the implied soft upload constraint in our model. Secondly, this would lessen the missing piece syndrome if some of the peers arrive with pieces at time of arrival (i.e. if there are gifted peers) because those peers would be uploading piece one a factor $\eta$ more quickly than they would be downloading other pieces. Their contribution to the upload rate of piece one before they become peer seeds would thus be increased by a factor $\eta$.

Either of the above two approaches leads to a violation of our implicit assumption that peers upload and download at the same rates. Further, if no peers arrive with pieces (so there are no gifted peers) the stability condition wouldn't change anyway. A third approach would be to consider a push or pull scenario, but explicitly limit upload and download rates at a peer to be equal when they occur simultaneously. Specifically, if some peers arrive with pieces, so there exist gifted peers, then in the original model those gifted peers tend to have successful uploads (they have the rare piece one to give to other peers) and successful downloads (they need pieces other than piece one that most of the other peers have). Having those gifted peers upload and download at equal rates would preserve the balance implicit in our original model, and the condition for stability would remain unchanged.

D. The Borderline of StabiUty 

Theorem 1 provides a sufficient condition for stability 
and a matching sufficient condition for instability, but it 
leaves open the borderline case: namely, when equality 
holds in (3) (or, equivalently, (2)) for one or more values 
of k ∈ ℱ and the strict inequality (3) holds for all other 
k. While it may not be interesting from a practical point 
of view, we comment on the borderline case. First, we 
give a precise result for a limiting case of the original 
system, and then we offer a conjecture. As in [10], 
a simpler network model results by taking a limit as 
μ → ∞. Call a state slow if all peers in the system 
have the same type, which includes the state such that 
there are no peers in the system. Otherwise, call a state 
fast. The total rate of transition out of any slow state 
does not depend on μ, and the total rate out of any 
fast state is bounded below by a positive constant times 
μ. For very large values of μ, the process spends most 
of its time in slow states. The original Markov process 
can be transformed into a new one by watching the 
original process while it is in the set of slow states. 
This means removing the portions of each sample path 
during which the process is in fast states, and 
time-shifting the remaining parts of the sample path to leave 
no gaps in time. The limiting Markov process, which we 




Fig. 3. Transition rates of the μ = ∞ variation of Example 3 with 
λ_i = λ for all i. 



call the μ = ∞ process, is the weak limit (defined as 
usual for probability measures on the space of càdlàg 
sample paths equipped with the Skorohod topology) of 
the original process watched in the set of slow states, as 
μ → ∞. If γ is fixed as μ → ∞ the model becomes 
degenerate, because a single peer seed would quickly 
convert all other peers into peer seeds. If γ = θμ for 
fixed θ and μ → ∞ then the μ = ∞ model is more 
interesting but somewhat complicated. So we consider 
γ = ∞ for simplicity. For further simplicity we consider 
networks of the form in Example 3 (for any K ≥ 2) with 
symmetric arrival rates. Thus λ_C = λ if |C| = 1 and 
λ_C = 0 otherwise. Also, U_s = 0 (no fixed seed) and 
γ = ∞. Note that these networks are borderline cases, 
not covered by Theorem 1. 
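
The "watching" construction described above can be sketched in a few lines. This is only an illustration under assumed conventions that are not from the paper: a sample path is a list of (state, holding time) pairs, a state is a tuple holding one piece set per peer, and the names `watch_in_slow_states` and `is_slow` are hypothetical.

```python
def is_slow(state):
    """A state is slow if all peers have the same type (the empty system counts).

    Assumed encoding: `state` is a tuple with one frozenset of pieces per peer.
    """
    return len(set(state)) <= 1


def watch_in_slow_states(path, slow=is_slow):
    """Return the sample path watched in the set of slow states.

    `path` is a list of (state, holding_time) pairs. Fast segments are
    removed; dropping their holding times is the time shift that closes
    the gaps. Consecutive visits to the same slow state are merged,
    since the watched process never left that state.
    """
    watched = []
    for state, hold in path:
        if not slow(state):
            continue  # delete the portion spent in a fast state
        if watched and watched[-1][0] == state:
            watched[-1] = (state, watched[-1][1] + hold)
        else:
            watched.append((state, hold))
    return watched
```

For large μ the deleted fast segments are short, which is why the watched process is a reasonable proxy for the original one in this regime.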

By symmetry of the model, the state space of the μ = ∞ 
process can be reduced to S = {(0, 0)} ∪ {(n, k) : 
n ≥ 1, 1 ≤ k ≤ K − 1}, where a state (n, k) corresponds 
to n peers in the system which all possess the same set 
of k pieces. State (0, 0) is transient. The transition rate 
diagram is pictured in Figure 3 for K = 3. States of the 
form (n, K − 1) form the top layer of states, and are those 
for which all peers have the same set of K − 1 pieces. 
The transitions out of such a state (n, K − 1) are described 
as follows. There is a transition to state (n + 1, K − 1) 
with rate (K − 1)λ, corresponding to the arrival of a new 
peer possessing one of the K − 1 pieces that the other 
peers already have; the new peer instantly obtains the 
rest of those K − 1 pieces from the other peers. At rate λ a 
new peer arrives with the piece missing from all the other 
peers. The new peer downloads and uploads at equal 
rates, until it either obtains all the K − 1 other pieces, or 
until all the other peers have departed. By the nature of 
Poisson processes, the probability distribution of the next 
state can be described in terms of flips of a fair coin, with 
"heads" denoting an upload by the new peer and "tails" 
denoting a download by the new peer. Let Z denote the 
number of "heads" in an experiment of repeated coin 
flips, when a fair coin is flipped until "tails" is observed 
K − 1 times. Then Z represents the potential number of 
peers already in the system that can leave due to uploads 
from the new peer. If Z ≤ n − 1, then the next state is 
(n − Z, K − 1). If Z ≥ n then the new state will have 
the form (1, j) with 1 ≤ j ≤ K − 1, corresponding to 
the case that all peers that were originally in the system 
depart, and the new peer remains. (The distribution over 
j can be computed easily but is not important.) Note 
that E[Z] = K − 1, so the rate (K − 1)λ of upward 
unit jumps is equal to the mean rate λE[Z] of decrease 
due to downward jumps (ignoring the lower boundary). 
Thus, when the process is in the top layer of states, it 
evolves as a stationary, independent increment process 
with zero drift. Such processes are null-recurrent, and 
therefore, the μ = ∞ process is null-recurrent. 
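
The coin-flip description above lends itself to a quick numerical check. The following is a minimal sketch, not part of the paper: the function names are hypothetical, and the second coordinate of a state (1, j) is returned as None because its distribution is not modeled here. It samples Z and one jump of the top-layer chain, so that E[Z] = K − 1 and the zero drift can be checked empirically.

```python
import random


def sample_z(K, rng):
    """Number of heads seen before the (K-1)-th tail of a fair coin."""
    heads = tails = 0
    while tails < K - 1:
        if rng.random() < 0.5:
            heads += 1  # upload by the new peer: one old peer can leave
        else:
            tails += 1  # download by the new peer
    return heads


def next_top_layer_state(n, K, rng):
    """One jump of the mu = infinity chain out of the state (n, K-1).

    Arrivals occur at total rate K * lam, so the next arrival carries the
    missing piece with probability 1/K and a common piece otherwise.
    """
    if rng.random() < (K - 1) / K:
        return (n + 1, K - 1)
    z = sample_z(K, rng)
    if z <= n - 1:
        return (n - z, K - 1)
    return (1, None)  # all old peers departed; j is left unspecified


def mean_z(K, trials=200_000, seed=0):
    """Monte Carlo estimate of E[Z]."""
    rng = random.Random(seed)
    return sum(sample_z(K, rng) for _ in range(trials)) / trials
```

Far from the lower boundary, averaging the change in n over many calls to `next_top_layer_state` should give a value near zero, matching the zero-drift claim.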

In essence, the μ = ∞ process is simple because 
peers remain young for only an instant; there are no 
infections of young peers by gifted peers. If μ is finite, 
such infections effectively increase the departure rate, 
by roughly a constant divided by the number of peers in 
the system. The constant is decreasing in μ. A reflecting 
Brownian motion with negative drift inversely 
proportional to the state is positive recurrent if the constant of 
proportionality is sufficiently large, and is null recurrent 
otherwise. So the use of a diffusion approximation leads 
us to pose the following conjecture, which pertains to 
the symmetric flat-network model considered in [11]: 

Conjecture 17. Let K > 1 and suppose λ_C = λ for 
|C| = 1 and λ_C = 0 otherwise. For some a_K > 0, the 
process is positive recurrent if 0 < μ/λ < a_K and is 
null recurrent if μ/λ > a_K. 
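
The diffusion heuristic that motivates the conjecture can be made concrete with a short calculation. The setup below is an illustrative assumption, not the paper's model: a diffusion on [x_0, ∞) with x_0 > 0, reflection at x_0 (the term L_t), constant variance parameter σ², and drift −c/x.

```latex
% Assumed diffusion: reflecting at x_0 > 0, drift -c/x, variance parameter \sigma^2
dX_t = -\frac{c}{X_t}\,dt + \sigma\,dW_t + dL_t, \qquad X_t \ge x_0 .

% An invariant density p has zero probability flux:
\frac{\sigma^2}{2}\,p'(x) + \frac{c}{x}\,p(x) = 0
\quad\Longrightarrow\quad
p(x) \propto x^{-2c/\sigma^2}, \qquad x \ge x_0 .

% The density can be normalized exactly when
\int_{x_0}^{\infty} x^{-2c/\sigma^2}\,dx < \infty
\iff c > \frac{\sigma^2}{2} .
```

Under these assumptions the diffusion is positive recurrent when c > σ²/2. For 0 ≤ c ≤ σ²/2 it is still recurrent (the scale function, with derivative proportional to x^{2c/σ²}, is unbounded at infinity) but has no invariant probability density, so it is null recurrent. This matches the threshold behavior posited in the conjecture.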

IX. Conclusion 

By focusing on the missing piece syndrome, which 
affects the performance of a P2P system, we have 
identified the minimum seed dwell times needed to 
stabilize the system. The model includes a fixed seed, 
peers arriving with pieces, and seeds dwelling for a 
while as peer seeds after obtaining the complete file. 
It is a mathematical simplification of a P2P system 
during the period of several hours or days after a flash 
crowd initiation of a file transfer, when the arrival 
of new peers is relatively steady. Our result identifies 
the stability region under all possible rates of arrival, 
mean times between transfer attempts, and distribution 
of pieces brought in by new peers. For tractability, we 
assumed that the times between upload attempts and 
the dwell times of seeds are exponentially distributed 
random variables. However, we conjecture the results 
hold for more general distributions; the instability half of 
our proof does not rely on the assumption of exponential 
distributions. Theorem 1 and its extensions given here 
hold under the stated specific modeling assumptions. It 
is our hope that the results help with intuition and can be 
adapted to many other scenarios, including such effects 
as heterogeneous link speeds or network topologies other 
than the fully connected one. 

We summarize what can be taken away from our 
analysis, and point to future work. The first point is 
that stability can be achieved (within the confines of 
the model) if peers remain in the system a relatively 
short amount of time, no longer than the time needed to 
upload one piece after obtaining a complete collection. 
A second point is that network coding can significantly 
lessen the effect of the missing piece syndrome in the 
case that some peers are given pieces (random linear 
combinations of data pieces) upon arrival. 

A third point is that the stability condition is insensi- 
tive to the piece selection policy, and to network coding 
if peers don't arrive with pieces (i.e. no gifted peers). 
However, some systems that are provably unstable in 
the sense that they are modeled by transient Markov 
processes, can be well behaved over long periods of time 
in practice. There may be a quasi-stable portion of the 
state space in which the process dwells for a long time 
before the onset of a large one club. The use of 
network coding or choice of piece selection policies can 
have a large impact on how long it takes the system 
to enter a state with a large one club. A possible study 
for future work would be to explore the longevity of a 
quasi-equilibrium in good network states. 

X. Appendix 

Miscellaneous results and a definition used in the main 
part of the paper are collected in this appendix. 

Proposition 18. Combined Foster-Lyapunov stability 
criterion and moment bound, continuous time (See [23, 
24].) Suppose X is a continuous-time, irreducible 
Markov process on a countable state space S with 
generator matrix Q. Suppose V, f, and g are nonnegative 
functions on S such that QV(x) ≤ −f(x) + g(x) for 
all x ∈ S, and, for some δ > 0, the set C defined by 
C = {x : f(x) < g(x) + δ} is finite. Suppose also that 
{x : V(x) ≤ K} is finite for all K. Then X is positive 
recurrent and, if π denotes the equilibrium distribution, 
Σ_x f(x)π(x) ≤ Σ_x g(x)π(x). 
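
As a small worked instance of the criterion, consider an M/M/1 queue, which is not a model from the paper and is used here only for illustration. With V(x) = x² / (2(b − a)), f(x) = x, and the constant g = (a + b) / (2(b − a)), where a and b are the assumed arrival and service rates with a < b, the drift condition holds and the moment bound gives E[X] ≤ g. The sketch below checks the drift inequality state by state on a truncated range and compares the bound with the exact stationary mean a / (b − a).

```python
def mm1_drift(V, x, arrival, service):
    """QV(x) for an M/M/1 queue: +1 at rate `arrival`, -1 at rate `service` if x >= 1."""
    drift = arrival * (V(x + 1) - V(x))
    if x >= 1:
        drift += service * (V(x - 1) - V(x))
    return drift


def check_foster_lyapunov(arrival, service, max_state=500):
    """Check QV <= -f + g on states 0..max_state; requires arrival < service.

    Returns (condition_holds, moment_bound, exact_mean).
    """
    gap = service - arrival
    V = lambda x: x * x / (2.0 * gap)
    f = lambda x: float(x)
    g_const = (arrival + service) / (2.0 * gap)
    holds = all(
        mm1_drift(V, x, arrival, service) <= -f(x) + g_const + 1e-9
        for x in range(max_state + 1)
    )
    exact_mean = arrival / gap  # rho / (1 - rho) for the M/M/1 queue
    return holds, g_const, exact_mean
```

For arrival rate 1 and service rate 2 the bound is 1.5 while the exact mean is 1, so the moment bound is valid but not tight in this example.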

Lemma 19. Bounding the drift of a function of a 
function of the state. Suppose X is a continuous-time, 
irreducible Markov process with countable state space 
S and with generator matrix Q = (q(x, x'), x, x' ∈ S). 
Suppose f : S → [0, ∞) and V : ℝ → [0, ∞) are two 
nonnegative functions; and suppose V is differentiable 
with derivative V' that is Lipschitz continuous with 
Lipschitz constant M. Then QV(f), the drift of V(f), 
satisfies 

$$
QV(f)(z) = \sum_{z' : z' \neq z} q(z,z')\,[V(f(z')) - V(f(z))]
\le V'(f(z))\,Qf(z) + \frac{M}{2} \sum_{z' : z' \neq z} q(z,z')\,[f(z') - f(z)]^2 ,
$$

for all z ∈ S, where Qf is the drift of f. 
Proof: The lemma follows from: 

$$
\begin{aligned}
V(f(z')) - V(f(z)) &= \int_{f(z)}^{f(z')} V'(x)\,dx \\
&= \int_{f(z)}^{f(z')} \bigl[ V'(f(z)) + \bigl( V'(x) - V'(f(z)) \bigr) \bigr]\,dx \\
&\le \int_{f(z)}^{f(z')} V'(f(z))\,dx + \left| \int_{f(z)}^{f(z')} M\,|x - f(z)|\,dx \right| \\
&= V'(f(z))\,[f(z') - f(z)] + \frac{M}{2}\,[f(z') - f(z)]^2 .
\end{aligned}
$$
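
A numerical spot check of the lemma is easy to set up. The sketch below uses illustrative choices that are not from the paper: f(z) = z on a birth-death chain with up-rate `up_rate` and down-rate `down_rate_per_peer * z`, and V(u) = sqrt(1 + u²), whose derivative is Lipschitz with constant M = 1 because |V''| ≤ 1. The hypothetical function returns the right side of the bound minus the left side, which should be nonnegative.

```python
import math


def drift_bound_gap(z, up_rate, down_rate_per_peer):
    """Right side minus left side of the Lemma 19 bound at state z."""
    V = lambda u: math.sqrt(1.0 + u * u)
    dV = lambda u: u / math.sqrt(1.0 + u * u)
    M = 1.0  # Lipschitz constant of V' for this choice of V

    # Possible jumps out of z as (target state, rate) pairs.
    jumps = [(z + 1, up_rate)]
    if z >= 1:
        jumps.append((z - 1, down_rate_per_peer * z))

    lhs = sum(q * (V(float(zp)) - V(float(z))) for zp, q in jumps)  # QV(f)(z)
    Qf = sum(q * (zp - z) for zp, q in jumps)                       # drift of f
    second_moment = sum(q * (zp - z) ** 2 for zp, q in jumps)
    rhs = dV(float(z)) * Qf + 0.5 * M * second_moment
    return rhs - lhs
```

With V(u) = u² and M = 2 the bound would hold with equality, so a choice with smaller curvature such as the one above shows the slack more clearly.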



Definition 4. Stochastic domination or coupling. 
Suppose A = (A_t : t ≥ 0) and B = (B_t : t ≥ 0) are 
two random processes, either both discrete-time random 
processes, or both continuous-time random processes 
having right-continuous with left limits sample paths. 
Then A is stochastically dominated by B if there is a 
single probability space (Ω, ℱ, P), and two random 
processes Ã and B̃ on (Ω, ℱ, P), such that 

(a) A, Ã have the same finite dimensional distributions, 

(b) B, B̃ have the same finite dimensional distributions, 
and (c) P{Ã_t ≤ B̃_t for all t} = 1. 

Clearly if A is stochastically dominated by B, then 
for any a and t, P{A_t ≥ a} ≤ P{B_t ≥ a}. 
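
A standard way to exhibit such a coupling, shown here only as an illustration and not taken from the paper, is thinning: two Poisson counting processes with rates `rate_a` ≤ `rate_b` are built from one stream of jump times, so that the slower one is below the faster one on every sample path. The function names are hypothetical.

```python
import random


def coupled_poisson_counts(rate_a, rate_b, horizon, seed=0):
    """Construct coupled jump times for Poisson processes of rates rate_a <= rate_b.

    The dominating process jumps at the points of a rate_b Poisson process.
    Each of those points is kept for the dominated process independently
    with probability rate_a / rate_b, which yields a rate_a Poisson process
    whose count never exceeds that of the dominating process.
    Returns (jumps_a, jumps_b), sorted jump times in [0, horizon].
    """
    if rate_a > rate_b:
        raise ValueError("coupling by thinning needs rate_a <= rate_b")
    rng = random.Random(seed)
    t, jumps_a, jumps_b = 0.0, [], []
    while True:
        t += rng.expovariate(rate_b)
        if t > horizon:
            break
        jumps_b.append(t)
        if rng.random() < rate_a / rate_b:
            jumps_a.append(t)
    return jumps_a, jumps_b


def count_at(jumps, t):
    """Value at time t of the counting process with the given jump times."""
    return sum(1 for s in jumps if s <= t)
```

Because the two constructed processes have the same laws as independent Poisson processes of the two rates, this gives properties (a) through (c) of the definition for that pair.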

Proposition 20. Kingman's moment bound adapted to 
compound Poisson processes ([25], see [10]) Let C be 
a compound Poisson process with C_0 = 0, with jump 
times given by a Poisson process of rate α, and jump 
sizes having mean m_1 and mean square value m_2. Then 
for all B > 0 and ε > α m_1, 

$$
P\{C_t < B + \varepsilon t \ \text{for all } t\} \ \ge\ 1 - \frac{\alpha m_2}{2B(\varepsilon - \alpha m_1)} . \tag{57}
$$
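
The bound (57) can be compared with simulation. The sketch below rests on assumptions that are not from the paper: jump sizes are exponentially distributed (so m_2 = 2 m_1²), and paths are simulated only up to a finite horizon, which can only overestimate the probability of staying below the line for all t. The empirical frequency should therefore be at least the bound, up to sampling noise.

```python
import random


def kingman_lower_bound(alpha, m1, m2, B, eps):
    """Right side of (57); requires B > 0 and eps > alpha * m1."""
    if B <= 0 or eps <= alpha * m1:
        raise ValueError("need B > 0 and eps > alpha * m1")
    return 1.0 - alpha * m2 / (2.0 * B * (eps - alpha * m1))


def stays_below_line(alpha, mean_jump, B, eps, horizon, rng):
    """True if C_t < B + eps * t at every jump time up to `horizon`.

    Between jumps C is constant while the line rises, so it suffices
    to check the inequality right after each jump.
    """
    t, c = 0.0, 0.0
    while True:
        t += rng.expovariate(alpha)
        if t > horizon:
            return True
        c += rng.expovariate(1.0 / mean_jump)  # exponential jump, mean `mean_jump`
        if c >= B + eps * t:
            return False


def estimate_stay_probability(alpha, mean_jump, B, eps,
                              horizon=100.0, runs=2000, seed=0):
    """Fraction of simulated paths that stay below the line up to `horizon`."""
    rng = random.Random(seed)
    hits = sum(stays_below_line(alpha, mean_jump, B, eps, horizon, rng)
               for _ in range(runs))
    return hits / runs
```

For α = 1, exponential jumps of mean 1, B = 10, and ε = 2, the bound evaluates to 0.9.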

Lemma 21. A maximal bound for an M/GI/∞ queue 
([10]) Let M denote the number of customers in an 
M/GI/∞ queueing system, with arrival rate λ and 
mean service time m. Suppose that M_0 = 0. Then for 
B, ε > 0, 

$$
P\{M_t \ge B + \varepsilon t \ \text{for some } t \ge 0\} \ \le\ \frac{e^{\lambda(m+1)}\, 2^{-B}}{1 - 2^{-\varepsilon}} . \tag{58}
$$
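
To get a feel for (58), the bound can be evaluated and compared with a simulated crossing frequency. The sketch below makes assumptions that are not from the paper: service times are exponential with mean m, and paths are simulated only up to a finite horizon, which can only underestimate the crossing probability, so the comparison is one-sided.

```python
import heapq
import math
import random


def mginf_bound(lam, m, B, eps):
    """Right side of (58) for arrival rate lam and mean service time m."""
    return math.exp(lam * (m + 1.0)) * 2.0 ** (-B) / (1.0 - 2.0 ** (-eps))


def crosses_line(lam, m, B, eps, horizon, rng):
    """True if M_t >= B + eps * t at some arrival time up to `horizon`.

    M only increases at arrivals while the line rises, so a first
    crossing can only happen at an arrival time.
    """
    t, departures = 0.0, []  # min-heap of pending departure times
    while True:
        t += rng.expovariate(lam)
        if t > horizon:
            return False
        while departures and departures[0] <= t:
            heapq.heappop(departures)
        heapq.heappush(departures, t + rng.expovariate(1.0 / m))
        if len(departures) >= B + eps * t:
            return True


def crossing_frequency(lam, m, B, eps, horizon=50.0, runs=2000, seed=0):
    """Fraction of simulated paths that cross the line before `horizon`."""
    rng = random.Random(seed)
    return sum(crosses_line(lam, m, B, eps, horizon, rng)
               for _ in range(runs)) / runs
```

For λ = 1, m = 1, B = 12, and ε = 1 the bound equals e²/2048, roughly 0.0036.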